Q-learning as a monotone scheme

2024-05-30

Lingyi Yang

Abstract

Stability issues with reinforcement learning methods persist. To better understand some of these stability and convergence issues involving deep reinforcement learning methods, we examine a simple linear quadratic example. We interpret the convergence criterion of exact Q-learning in the sense of a monotone scheme and discuss consequences of function approximation on monotonicity properties.
