
Efficient Wasserstein Natural Gradients for Reinforcement Learning

2020-10-12 · ICLR 2021 · Code Available

Ted Moskovitz, Michael Arbel, Ferenc Huszár, Arthur Gretton


Abstract

A novel optimization approach is proposed for application to policy gradient methods and evolution strategies for reinforcement learning (RL). The procedure uses a computationally efficient Wasserstein natural gradient (WNG) descent that takes advantage of the geometry induced by a Wasserstein penalty to speed optimization. This method follows the recent theme in RL of including a divergence penalty in the objective to establish a trust region. Experiments on challenging tasks demonstrate improvements in both computational cost and performance over advanced baselines.
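The abstract describes adding a divergence penalty to the policy optimization objective to establish a trust region. As a rough illustration of that general form (not the paper's actual WNG algorithm, and with hypothetical function and parameter names), a penalized policy-gradient surrogate might look like:

```python
import numpy as np

def penalized_pg_loss(ratios, advantages, divergence, coeff=0.1):
    """Policy-gradient surrogate with a divergence penalty (trust-region style).

    Hypothetical sketch: the paper uses a Wasserstein penalty, whose exact
    computation is not shown here; `divergence` stands in for any scalar
    divergence between the new and old policies.

    ratios:     pi_new(a|s) / pi_old(a|s) for sampled actions
    advantages: estimated advantages for those actions
    divergence: scalar divergence between new and old policies
    coeff:      penalty strength
    """
    surrogate = np.mean(ratios * advantages)
    # Maximizing (surrogate - penalty) == minimizing its negation.
    return -(surrogate - coeff * divergence)

# With zero divergence the loss reduces to the plain negative surrogate.
loss = penalized_pg_loss(np.array([1.0, 1.0]), np.array([2.0, 4.0]), divergence=0.0)
```

The WNG approach then exploits the geometry induced by the Wasserstein penalty to precondition the gradient of such an objective efficiently, rather than treating the penalty as a plain additive term.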
