
Provably Efficient Convergence of Primal-Dual Actor-Critic with Nonlinear Function Approximation

2022-02-28

Jing Dong, Li Shen, Yinggan Xu, Baoxiang Wang


Abstract

We study the convergence of the actor-critic algorithm with nonlinear function approximation under a nonconvex-nonconcave primal-dual formulation. Stochastic gradient descent ascent is applied with an adaptive proximal term for robust learning rates. We show the first efficient convergence result for primal-dual actor-critic, with a convergence rate of O(√(N d G^2)/N) under Markovian sampling, where G is the element-wise maximum of the gradient, N is the number of iterations, and d is the dimension of the gradient. Our result requires only the Polyak-Łojasiewicz condition on the dual variables, which is easy to verify and applicable to a wide range of reinforcement learning (RL) scenarios. The algorithm and analysis are general enough to extend to other RL settings, such as multi-agent RL. Empirical results on OpenAI Gym continuous control tasks corroborate our theoretical findings.
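To make the primal-dual setup concrete, below is a minimal sketch of stochastic gradient descent ascent with an AdaGrad-style adaptive step size on a toy saddle-point problem. This is an illustration of the generic SGDA template only, not the paper's actor-critic algorithm; the objective, step size, and noise level are all assumptions chosen for the example.

```python
import numpy as np

# Toy saddle-point problem:  min_x max_y  0.5*x^2 + x*y - 0.5*y^2,
# whose unique saddle point is (x, y) = (0, 0).  The gradients are
# grad_x = x + y  and  grad_y = x - y.
rng = np.random.default_rng(0)
x, y = 3.0, -2.0
gx2, gy2 = 0.0, 0.0          # running sums of squared gradients
eta, eps = 0.5, 1e-8         # base step size; eps avoids division by zero

for _ in range(2000):
    # stochastic gradients: exact gradients plus small Gaussian noise
    gx = (x + y) + 0.01 * rng.standard_normal()
    gy = (x - y) + 0.01 * rng.standard_normal()
    gx2 += gx ** 2
    gy2 += gy ** 2
    # AdaGrad-style per-coordinate scaling (a simple stand-in for an
    # adaptive proximal term): steps shrink as gradients accumulate
    x -= eta * gx / (np.sqrt(gx2) + eps)   # descent on the primal variable
    y += eta * gy / (np.sqrt(gy2) + eps)   # ascent on the dual variable

print(f"x = {x:.3f}, y = {y:.3f}")  # iterates approach the saddle point (0, 0)
```

On this strongly-convex-strongly-concave toy problem the adaptively scaled descent-ascent iterates contract toward the saddle point; the paper's setting is harder because the primal-dual objective there is nonconvex-nonconcave.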
