A fully custom simulation environment for interchangeably comparing reinforcement learning models. It provides a full-blown API for swapping algorithms in and out, a built-in general-purpose genetic neural network library, and other utilities for training and testing.
NOTE: If you would like to skip the write-up portion of the README, click here.
After experimenting with fixed-topology neuro-evolution within this environment, I've found that if detrimental fitness factors are not discounted, the population fails to develop any sort of effective policy. After hours of trial and error, passing detriment values through a logarithm seems to suffice: it keeps the detriments from having too serious an impact on the overall score while still heavily rewarding their minimization. You can view the actual method inside the FitnessPawn class under `calculate_fitness`.
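To make the idea concrete, here is a minimal sketch of what log-discounted fitness can look like. The factor names and the use of `log1p` are illustrative assumptions; the real logic lives in `FitnessPawn.calculate_fitness`.

```python
import math

def calculate_fitness(reward_total: float, detriment_total: float) -> float:
    """Illustrative sketch: reward factors count in full, while detriment
    factors are passed through a logarithm so they drag the score down
    without dominating it (assumed form, not the exact repo code)."""
    # log1p keeps the penalty finite and well-defined when detriments are 0
    return reward_total - math.log1p(detriment_total)
```

Under this shape, cutting detriments from 100 to 10 recovers about `log(101) - log(11) ≈ 2.2` fitness, so minimizing detriments is still heavily rewarded even though their absolute impact on the score stays small.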
With the default hyperparameters I have set in place, the mean fitness hovers around ~7 for roughly the first 1200 generations, with a standard deviation of about 0.5-1. The next 250 generations show major improvements in max generational scores, jumping to a mean of ~13, albeit with a larger standard deviation of around 2.5-3.
After another 100 generations there is yet another jump in performance, from the previous mean of ~13 to a peak of ~22, where it eventually hovered for another 1000 generations with no further policy improvements.
The model seems to cap out around a fitness of 23 in this simulation. That said, 23 is not the maximum fitness observed so far: the best was an unrecorded 28, while the highest recorded score was the spike of 25 back in figure 1. Further modifications and research into policy improvement are still underway.
- Built in Python 3.
- Clone this repository: `git clone https://github.com/McCrearyD/ML_Arena.git`
- To run any simulation, run `python3 -u main.py` inside the main directory.
- Follow the terminal instructions to run any type of simulation.
- To run all test assertions & the test environment, run `python3 -u test.py` inside the main directory.
- To view a graph for any saved population, run `python3 -u visualize.py` inside the main directory.
- Freeplay: Create a single matchup using any type of pawn controller you'd like.
- Evolution:
  - Adversarial: Train a random (or previously saved) population against another.
  - Other: Train a random (or previously saved) population against another pawn type (i.e. dynamic or brainless).
- Balance: Run a balancing simulation for pawn statistical biases. Runs `x` match iterations concurrently and reports win/loss results for each bias (see the sketch below).
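As a rough illustration of the balance mode's reporting, the sketch below runs a fixed number of match iterations concurrently and tallies win/loss results per bias. `run_match`, the bias names, and the use of a thread pool are hypothetical stand-ins, not the repository's actual API.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

BIASES = ["aggressive", "defensive", "balanced"]  # hypothetical bias names

def run_match(match_id: int) -> str:
    """Stand-in for a real match; returns the winning bias at random."""
    return random.choice(BIASES)

def run_balance_simulation(iterations: int = 100) -> Counter:
    # Run `iterations` match iterations concurrently and count wins per bias.
    with ThreadPoolExecutor() as pool:
        results = pool.map(run_match, range(iterations))
    return Counter(results)

if __name__ == "__main__":
    wins = run_balance_simulation(100)
    for bias, count in wins.most_common():
        print(f"{bias}: {count} wins")
```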
- Blue Pawn: A blue pawn has its shield enabled.
- Red Pawn: A red pawn's health has reached 0.
- Number Below Pawn: A pawn with a number below it is a FitnessPawn. A FitnessPawn is typically assigned to a neural network in order to judge how well it performs; the higher the number, the better.
- Bar Above Pawn: Every pawn has a bar above its head that displays its health percentage:
  - Green = above 75%
  - Yellow = above 50%
  - Red = below 50%
- Red Laser: Long-distance laser.
  - Lighter Red: The laser hasn't traveled its minimum distance yet; if it collides with an enemy, it deals no damage.
  - Darker Red: Fully charged laser; it deals damage on impact.
- Blue Laser: Short-distance laser.
- Green Laser (Debug): A laser is highlighted green if it is the 'imminent laser' of the current player pawn.
Key(s) | Description | Context |
---|---|---|
W, A, S, D | Movement | Player |
LEFT, UP, RIGHT, DOWN | Directional movement | Player |
Q | Shield | Player |
SHIFT | Long-range laser | Player |
SPACE | Short-range laser | Player |
OPEN-BRACKET | Draw all pawns in every match | Global |
CLOSE-BRACKET | Show connections between all on-screen pawns | Global |
BACK-SLASH | Show pawn directional tracers | Global |
ESCAPE | End the simulation; if Evolutionary, save populations under their names | Global |
BACKSPACE | Force-reset the environment; if Evolutionary, end the current generation | Global |
N | (Toggle) Visually display the currently focused creature(s)' neural networks | Evolution |
P | (Toggle) Speed up all gameplay. Updates per frame can be changed in environment.py > class Environment > var('speed_up_cycles') | Global |
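For reference, the speed-up setting mentioned in the last row above lives on the Environment class. A minimal sketch of the shape it might take (the value shown is an assumed placeholder, not the repository default):

```python
# environment.py (sketch; only the attribute name comes from the table above)
class Environment:
    # Number of simulation updates performed per rendered frame
    # while speed-up mode (the P key) is toggled on.
    speed_up_cycles = 10
```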
Key(s) | Description |
---|---|
BACKSPACE | Next generation |
ESCAPE | Randomize goal location |
(Screenshots, in order: Custom Network Visualization, Freeplay, Evolutionary Training, Statistical Bias Balancing, Test Environment, Non-Graphical Simulation w/ Generational Reports, Saved Population Visualization.)