Imitation Learning

Learning from demonstrations: behavioral cloning, DAgger, inverse reinforcement learning, GAIL, and their applications in robotics.

Learning Objectives

1. Why Imitation Learning?

1.1 Reward Engineering Problem

1.2 Expert Demonstrations

2. Behavioral Cloning (BC)

2.1 Supervised Learning on Demonstrations

2.2 Distribution Shift Problem

2.3 PyTorch Implementation

3. DAgger (Dataset Aggregation)

3.1 Algorithm

3.2 When DAgger Helps

4. Inverse Reinforcement Learning (IRL)

4.1 Max-Entropy IRL

4.2 Apprenticeship Learning

5. Generative Adversarial Imitation Learning (GAIL)

5.1 GAN Meets RL

5.2 Algorithm

6. Diffusion Policy

6.1 Diffusion Models for Action Generation

6.2 Architecture

7. Robot Applications

Exercises

References
