Homework for CS294-112
- hw1 Behavioral Cloning and DAgger
- hw2 Policy Gradient
- hw3 Actor-Critic, Q-learning
- hw4 Model-Based RL
- hw5
Source code: https://github.com/berkeleydeeprlcourse/homework
References:
https://github.com/Observerspy/CS294
https://github.com/xuwd11/cs294-112_hws