Towards Understanding Deep Policy Gradients: A Case Study on PPO

2020-12-14 · CUHK Course IERG5350 2020

Buhua Liu, CHONG YIN

Abstract

Deep reinforcement learning has shown impressive performance on many decision-making problems, with deep policy gradient algorithms prevailing in tasks with continuous action spaces. Although many algorithm-level improvements to policy gradient algorithms have been proposed, recent studies have found that code-level optimizations also play a critical role in the claimed enhancements. In this paper, we further investigate several code-level optimizations for the popular Proximal Policy Optimization (PPO) algorithm, aiming to provide insights into the importance of different components in practical implementations. A video presentation is available at https://youtu.be/M0uTLoEUwGQ
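To make the distinction between algorithm-level and code-level components concrete, the following is a minimal sketch and not the authors' code. It pairs the standard PPO clipped surrogate loss with per-batch advantage normalization, a detail that commonly appears in PPO implementations but not in the algorithm's description. The abstract does not say which optimizations the paper studies, so the choice of advantage normalization is an assumption made for illustration, and the function names `normalize_advantages` and `ppo_clip_loss` are hypothetical.

```python
import math


def normalize_advantages(advantages, eps=1e-8):
    """Illustrative code-level detail (an assumption, not taken from the paper):
    rescale a batch of advantages to zero mean and unit standard deviation."""
    n = len(advantages)
    mean = sum(advantages) / n
    var = sum((a - mean) ** 2 for a in advantages) / n
    std = math.sqrt(var)
    return [(a - mean) / (std + eps) for a in advantages]


def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Algorithm-level component: negative PPO clipped surrogate objective,
    averaged over the batch. Inputs are per-sample log-probabilities of the
    taken actions under the new and old policies, plus advantage estimates."""
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(lp_new - lp_old)  # pi_new(a|s) / pi_old(a|s)
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return -total / len(advantages)


if __name__ == "__main__":
    raw_adv = [1.0, 2.0, 3.0]
    adv = normalize_advantages(raw_adv)
    loss = ppo_clip_loss([-0.9, -1.1, -0.5], [-1.0, -1.0, -1.0], adv)
    print(adv, loss)
```

Dropping the call to `normalize_advantages` leaves the clipped objective unchanged on paper but alters the scale of the gradients, which is the kind of implementation-level effect the abstract refers to.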

Tasks

Decision Making, Deep Reinforcement Learning, reinforcement-learning, Reinforcement Learning (RL)
