Generative Adversarial Imitation Learning
Jonathan Ho, Stefano Ermon
Code
- github.com/Kaixhin/imitation-learning (PyTorch, ★ 563)
- github.com/Div99/IQ-Learn (PyTorch, ★ 377)
- github.com/twni2016/f-IRL (PyTorch, ★ 45)
- github.com/ran-weii/cleanil (PyTorch, ★ 24)
- github.com/emunaran/stochastic-human-driving-policies-drl (PyTorch, ★ 16)
- github.com/Techget/gail-tf-sc2 (TensorFlow, ★ 7)
- github.com/rohitrango/Reward-bias-in-GAIL (TensorFlow, ★ 4)
- github.com/KshamaDw/collaborative-competitive-games (TensorFlow, ★ 0)
- github.com/nav74neet/gail-tf-gym (TensorFlow, ★ 0)
- github.com/opendilab/DI-engine/blob/main/ding/reward_model/gail_irl_model.py (PyTorch, ★ 0)
Abstract
Consider learning a policy from example expert behavior, without interaction with the expert or access to reinforcement signal. One approach is to recover the expert's cost function with inverse reinforcement learning, then extract a policy from that cost function with reinforcement learning. This approach is indirect and can be slow. We propose a new general framework for directly extracting a policy from data, as if it were obtained by reinforcement learning following inverse reinforcement learning. We show that a certain instantiation of our framework draws an analogy between imitation learning and generative adversarial networks, from which we derive a model-free imitation learning algorithm that obtains significant performance gains over existing model-free methods in imitating complex behaviors in large, high-dimensional environments.
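The GAN-style instantiation the abstract alludes to alternates two steps: a discriminator is trained to distinguish the policy's state-action pairs from the expert's, and the policy is then improved against a surrogate reward derived from the discriminator's output. The sketch below is a toy illustration of the discriminator side only, using a hand-rolled logistic regression in NumPy; the Gaussian "expert" and "policy" data, the learning rate, and the iteration count are illustrative assumptions, not the paper's TRPO-based algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: expert state-action pairs cluster around +1,
# the current policy's pairs around -1 (2-D: [state, action]).
expert = rng.normal(loc=1.0, scale=0.5, size=(256, 2))
policy = rng.normal(loc=-1.0, scale=0.5, size=(256, 2))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Logistic-regression discriminator D(s, a) -> P(pair came from the policy),
# trained with labels: policy samples = 1, expert samples = 0.
w = np.zeros(2)
b = 0.0
lr = 0.1
x = np.vstack([policy, expert])
y = np.concatenate([np.ones(len(policy)), np.zeros(len(expert))])
for _ in range(200):
    p = sigmoid(x @ w + b)
    w -= lr * (x.T @ (p - y)) / len(x)   # gradient of the logistic loss
    b -= lr * np.mean(p - y)

# Surrogate reward for the policy step: r(s, a) = -log D(s, a),
# which pushes the policy toward regions the discriminator
# cannot tell apart from expert data.
reward = -np.log(sigmoid(policy @ w + b) + 1e-8)
print(reward.mean())
```

In the full method, the policy update on this surrogate reward is a reinforcement-learning step (the paper uses trust-region policy optimization), and the two updates are interleaved rather than run to convergence.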