On the Convergence of Reinforcement Learning in Nonlinear Continuous State Space Problems
Raman Goyal, Suman Chakravorty, Ran Wang, Mohamed Naveed Gul Mohamed
Abstract
We consider the problem of Reinforcement Learning for nonlinear stochastic dynamical systems. We show that in the RL setting there is an inherent "Curse of Variance" in addition to Bellman's infamous "Curse of Dimensionality"; in particular, we show that the variance of the solution grows factorially (super-exponentially) in the order of the approximation. A fundamental consequence is that this precludes the search for anything other than "local" feedback solutions in RL, in order to control the explosive growth of variance and thus ensure accuracy. We further show that the deterministic optimal control has a perturbation structure, in that the higher-order terms do not affect the calculation of the lower-order terms, which can be exploited in RL to obtain accurate local solutions.
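To build intuition for the "Curse of Variance" claim, the following toy sketch (our illustration, not the paper's construction) assumes that learning a higher-order feedback term from sampled rollouts implicitly requires estimating correspondingly higher-order moments of the noise. For standard Gaussian noise, the relative standard deviation of the sample estimate of the 2q-th moment grows factorially in q, so higher-order terms become drastically harder to estimate accurately:

```python
# Toy illustration of factorial variance growth when estimating
# higher-order noise moments by Monte Carlo (hypothetical setup,
# not the paper's algorithm).
import numpy as np

rng = np.random.default_rng(0)
N = 100_000                        # number of Monte Carlo samples
w = rng.standard_normal(N)         # zero-mean, unit-variance noise samples

for q in range(1, 7):
    order = 2 * q
    est = np.mean(w ** order)                   # sample estimate of E[w^{2q}]
    true = np.prod(np.arange(1, order, 2))      # exact value (2q-1)!! for N(0,1)
    # Relative std of the estimator is
    # sqrt((4q-1)!! / ((2q-1)!!)^2 - 1) / sqrt(N), which grows factorially in q.
    rel_std = np.sqrt(np.prod(np.arange(1, 2 * order, 2)) / true**2 - 1) / np.sqrt(N)
    print(f"order {order}: estimate={est:.3g}, true={true:.3g}, rel. std ~ {rel_std:.3g}")
```

Running this shows the relative error of the moment estimate exploding with the order q at a fixed sample budget N, which is the sense in which restricting attention to low-order, local feedback solutions keeps the variance controllable.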