Momentum-Based Policy Gradient Methods

2020-07-13ICML 2020Code Available0· sign in to hype

Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang

Code Available — Be the first to reproduce this paper.

Code

github.com/gaosh/MBPG
OfficialIn paperpytorch★ 8

Abstract

In the paper, we propose a class of efficient momentum-based policy gradient methods for the model-free reinforcement learning, which use adaptive learning rates and do not require any large batches. Specifically, we propose a fast important-sampling momentum-based policy gradient (IS-MBPG) method based on a new momentum-based variance reduced technique and the importance sampling technique. We also propose a fast Hessian-aided momentum-based policy gradient (HA-MBPG) method based on the momentum-based variance reduced technique and the Hessian-aided technique. Moreover, we prove that both the IS-MBPG and HA-MBPG methods reach the best known sample complexity of O(^-3) for finding an -stationary point of the non-concave performance function, which only require one trajectory at each iteration. In particular, we present a non-adaptive version of IS-MBPG method, i.e., IS-MBPG*, which also reaches the best known sample complexity of O(^-3) without any large batches. In the experiments, we apply four benchmark tasks to demonstrate the effectiveness of our algorithms.

Tasks

Policy Gradient Methods

Momentum-Based Policy Gradient Methods

Code

Abstract

Tasks

Reproductions