This repository provides a Gymnasium environment for training reinforcement learning controllers that balance an inverted pendulum on a quadruped.
Currently, only Unitree's Go2 with the TD3 (Twin Delayed DDPG) algorithm is supported out of the box. The codebase follows a modular structure, so adding other models and algorithms is straightforward.
This repository uses stable_baselines3 with changes made so that it works with gymnasium v1.0.0 (refer here for changes).
In the meantime, you may use my fork.
## install prerequisites; it is much easier to install them in a separate virtual environment to avoid clashing gymnasium versions
# to create a new virtualenv: python -m venv venv && source venv/bin/activate
git clone https://github.com/pulak-gautam/quadruped-pend-gym && cd quadruped-pend-gym
pip install -r requirements.txt
pip install -e .
# run example standup script
python3 tests/test_quad_standup.py
# train the TD3 agent to balance the inverted pendulum
python3 scripts/train_td3.py
# evaluate the TD3 agent on the inverted pendulum task; make sure to change <model-path> to the path of your trained model
python3 scripts/eval_td3.py --model_path <model-path>
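
Evaluation amounts to rolling out a policy in the environment and summing rewards per episode. Below is a minimal sketch of such a loop under the Gymnasium `reset`/`step` convention. It does not reproduce `scripts/eval_td3.py` or this repository's environment: `ToyPendulumEnv`, its one-dimensional dynamics, and the proportional policy are hypothetical stand-ins chosen so the example runs with the standard library alone.

```python
import random


class ToyPendulumEnv:
    """Hypothetical stand-in following the Gymnasium reset/step convention.

    Not this repository's environment: the state is a single pole angle.
    """

    def __init__(self, max_steps=200, fall_angle=0.5):
        self.max_steps = max_steps
        self.fall_angle = fall_angle
        self.angle = 0.0
        self.steps = 0

    def reset(self, seed=None):
        rng = random.Random(seed)
        self.angle = rng.uniform(-0.05, 0.05)
        self.steps = 0
        return self.angle, {}  # (observation, info)

    def step(self, action):
        # Toy unstable dynamics: the angle grows unless the action counters it.
        self.angle += 0.1 * self.angle - 0.1 * action
        self.steps += 1
        terminated = abs(self.angle) > self.fall_angle  # pendulum fell over
        truncated = self.steps >= self.max_steps        # time limit reached
        reward = 1.0 - abs(self.angle)
        return self.angle, reward, terminated, truncated, {}


def evaluate(env, policy, episodes=3, seed=0):
    """Roll out `policy` and return the undiscounted return of each episode."""
    returns = []
    for ep in range(episodes):
        obs, _ = env.reset(seed=seed + ep)
        done, total = False, 0.0
        while not done:
            obs, reward, terminated, truncated, _ = env.step(policy(obs))
            total += reward
            # An episode ends on either flag; they mean different things.
            done = terminated or truncated
        returns.append(total)
    return returns


if __name__ == "__main__":
    # Hypothetical proportional controller standing in for a trained agent.
    print(evaluate(ToyPendulumEnv(), policy=lambda obs: 2.0 * obs))
```

Keeping `terminated` (the pendulum fell) separate from `truncated` (the time limit was hit) matters mostly during training, where only true termination should stop value bootstrapping.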
- The TD3 implementation is heavily derived from cleanrl
- Reward shaping is inspired by legged_gym
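
As background on the TD3 algorithm mentioned above, here is a minimal sketch of how a TD3 critic target is typically computed, using target policy smoothing and the clipped double-Q minimum. It is a scalar, dependency-free illustration rather than this repository's or cleanrl's code: the function name `td3_target`, the callables passed in, and the default hyperparameters are assumptions for the example.

```python
import random


def clip(x, low, high):
    """Clamp x to the closed interval [low, high]."""
    return max(low, min(high, x))


def td3_target(reward, done, next_obs, target_actor, target_q1, target_q2,
               gamma=0.99, policy_noise=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=random):
    """Return the TD3 critic target for one transition (scalar action).

    All argument names and defaults are illustrative assumptions.
    """
    # Target policy smoothing: perturb the target action with clipped Gaussian noise.
    noise = clip(rng.gauss(0.0, policy_noise), -noise_clip, noise_clip)
    next_action = clip(target_actor(next_obs) + noise, action_low, action_high)
    # Clipped double-Q: use the smaller of the two target critics' estimates.
    next_q = min(target_q1(next_obs, next_action),
                 target_q2(next_obs, next_action))
    # One-step bootstrapped target; no bootstrapping past a terminal state.
    return reward + gamma * (1.0 - float(done)) * next_q
```

Both critics are regressed toward this same target. The "delayed" part of TD3 refers to updating the actor and the target networks less often than the critics, which this sketch does not cover.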