This code is intended mainly as a proof of concept of the algorithms presented in [1]. The implementations are not particularly clear, efficient, well tested, or numerically stable. We advise against using this software for non-didactic purposes.
This software is licensed under the MIT License.
- Model-based (value and policy iteration)
- Model-free (Monte Carlo, Sarsa, Q-learning and variations)
- Model-building (Dyna-Q)
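
To give a flavor of the model-based family listed above, here is a minimal sketch of value iteration on a toy three-state chain. It is an illustration only and does not reflect this repository's actual API: the `value_iteration` function, the `P[state][action]` transition encoding, and the toy MDP are all assumptions made up for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical MDP encoding (not this repository's format):
# P[state][action] = list of (probability, next_state, reward).
Transitions = Dict[int, Dict[int, List[Tuple[float, int, float]]]]


def q_value(outcomes, V, gamma):
    """Expected one-step return of an action, given current value estimates."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)


def value_iteration(P: Transitions, gamma: float = 0.9, tol: float = 1e-8):
    """Sweep the Bellman optimality backup in place until values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s, actions in P.items():
            if not actions:  # terminal state: value stays at 0
                continue
            best = max(q_value(o, V, gamma) for o in actions.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {
        s: max(actions, key=lambda a: q_value(actions[a], V, gamma))
        for s, actions in P.items()
        if actions
    }
    return V, policy


# Toy chain: states 0 and 1 are non-terminal, state 2 is terminal.
# Action 0 moves left (or stays put at state 0), action 1 moves right.
# Only the transition 1 -> 2 yields a reward (of 1).
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 0.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 2, 1.0)]},
    2: {},
}

V, policy = value_iteration(P)
print(V)
print(policy)
```

With a discount factor of 0.9, the greedy policy moves right in both non-terminal states, and the value of state 0 is the discounted value of state 1.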
See the examples directory for usage examples.
[1] Sutton, R.S. and Barto, A.G. Reinforcement Learning: An Introduction. MIT Press, 1998.