Deep Q-Networks
Deep reinforcement learning with Q-networks: DQN, Double DQN, Dueling DQN, prioritized experience replay, and Rainbow DQN.
Learning Objectives
1. From Tabular to Function Approximation
1.1 Why Deep Networks?
1.2 The Deadly Triad
2. DQN (Deep Q-Network)
2.1 Experience Replay
2.2 Target Network
2.3 Full Algorithm
2.4 PyTorch Implementation: CartPole
3. Double DQN
3.1 Overestimation Bias
3.2 Solution
4. Dueling DQN
4.1 Architecture: V(s) + A(s,a)
5. Prioritized Experience Replay
5.1 TD Error as Priority
6. Rainbow DQN
7. DQN for Robotics: Discrete Control
Exercises
References