Inverted Pendulum balancing on Quadruped using off-policy RL

This repository provides a gymnasium environment for training reinforcement learning controllers that balance an inverted pendulum on a quadruped.
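For readers new to the gymnasium interface, the interaction loop that such an environment plugs into can be sketched as follows. This is a minimal sketch, not the repository's environment: `ToyPendulumEnv`, its toy dynamics, its reward, and the `pd_policy` gains are all hypothetical stand-ins written in plain Python. Only the shape of the API is taken from gymnasium: `reset()` returns `(obs, info)` and `step()` returns `(obs, reward, terminated, truncated, info)`.

```python
import math
import random


class ToyPendulumEnv:
    """Hypothetical stand-in that mirrors the gymnasium reset/step signatures."""

    def __init__(self, max_steps=200, dt=0.02, seed=0):
        self.max_steps = max_steps
        self.dt = dt
        self.rng = random.Random(seed)
        self.theta = 0.0      # pendulum angle from upright, in radians
        self.theta_dot = 0.0  # angular velocity, in rad/s
        self.steps = 0

    def reset(self):
        # Start close to upright with a small random tilt.
        self.theta = self.rng.uniform(-0.05, 0.05)
        self.theta_dot = 0.0
        self.steps = 0
        return (self.theta, self.theta_dot), {}

    def step(self, action):
        # Toy dynamics (assumption): gravity tips the pendulum, the action pushes back.
        g_over_l = 9.81 / 0.5
        theta_ddot = g_over_l * math.sin(self.theta) + action
        self.theta_dot += theta_ddot * self.dt
        self.theta += self.theta_dot * self.dt
        self.steps += 1
        reward = math.exp(-self.theta ** 2)       # largest when upright
        terminated = abs(self.theta) > 0.5        # pendulum has fallen
        truncated = self.steps >= self.max_steps  # time limit reached
        return (self.theta, self.theta_dot), reward, terminated, truncated, {}


def run_episode(env, policy):
    """Roll out one episode and return (total reward, number of steps)."""
    obs, info = env.reset()
    total_reward, done = 0.0, False
    while not done:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total_reward += reward
        done = terminated or truncated
    return total_reward, env.steps


def pd_policy(obs):
    """Hand-tuned PD controller standing in for a trained agent."""
    theta, theta_dot = obs
    return -40.0 * theta - 8.0 * theta_dot


if __name__ == "__main__":
    print(run_episode(ToyPendulumEnv(), pd_policy))
```

A trained agent would take the place of `pd_policy`; the loop itself stays the same.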

Currently, only Unitree's Go2 with the TD3 (Twin Delayed DDPG) algorithm is supported out of the box. The codebase follows a modular structure, so other models and algorithms can be added easily.
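As a reminder of what TD3 adds on top of DDPG, the three ingredients (target policy smoothing, clipped double-Q targets, delayed actor updates) can be sketched on scalars in plain Python. This is an illustrative sketch and not the code in this repository; the function names and the default values for `gamma` and `policy_delay` are assumptions chosen for the example.

```python
import random


def clip(x, low, high):
    """Clamp x to the interval [low, high]."""
    return max(low, min(high, x))


def smoothed_target_action(target_action, noise_std, noise_clip, low, high, rng):
    """Target policy smoothing: add clipped Gaussian noise, then clip to action bounds."""
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    return clip(target_action + noise, low, high)


def td3_target(reward, done, next_q1, next_q2, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller of the two target critics."""
    return reward + gamma * (1.0 - float(done)) * min(next_q1, next_q2)


def should_update_actor(step, policy_delay=2):
    """Delayed policy updates: update the actor once every `policy_delay` critic updates."""
    return step % policy_delay == 0


if __name__ == "__main__":
    rng = random.Random(0)
    action = smoothed_target_action(0.9, noise_std=0.2, noise_clip=0.5,
                                    low=-1.0, high=1.0, rng=rng)
    print(action, td3_target(1.0, False, 5.0, 3.0, gamma=0.9))
```

Taking the minimum of the two critics counteracts the overestimation bias that a single critic tends to accumulate, which is the main motivation for the "twin" part of the name.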

This repository uses stable_baselines3, modified to work with gymnasium v1.0.0 (refer here for changes). You may use my fork in the meantime.

Getting Started

Installation

# installing under a separate virtual environment is recommended to avoid clashing gymnasium versions
# to create a new virtualenv: python -m venv venv && source venv/bin/activate
git clone https://github.com/pulak-gautam/quadruped-pend-gym && cd quadruped-pend-gym
# install prerequisites, then the package itself in editable mode
pip install -r requirements.txt
pip install -e .

Running

# run example standup script
python3 tests/test_quad_standup.py

# train the TD3 agent for balancing the inverted pendulum
python3 scripts/train_td3.py

# evaluate the TD3 agent for balancing the inverted pendulum; make sure to change <model-path> to the path of your trained model
python3 scripts/eval_td3.py --model_path <model-path>

References

  • The TD3 implementation is heavily derived from cleanrl
  • The reward shaping is inspired by legged_gym