Deep Q-Networks

Deep reinforcement learning based on Q-networks: DQN, Double DQN, Dueling DQN, Prioritized Experience Replay, and Rainbow DQN.

Learning Objectives

1. From Tabular to Function Approximation

1.1 Why Deep Networks?

1.2 The Deadly Triad

2. DQN (Deep Q-Network)

2.1 Experience Replay

2.2 Target Network

2.3 Full Algorithm

2.4 PyTorch Implementation: CartPole

3. Double DQN

3.1 Overestimation Bias

3.2 Solution

4. Dueling DQN

4.1 Architecture: V(s) + A(s,a)

5. Prioritized Experience Replay

5.1 TD Error as Priority

6. Rainbow DQN

7. DQN for Robotics: Discrete Control

Exercises

References