Concentration of Contractive Stochastic Approximation and Reinforcement Learning

2021-06-27

Siddharth Chandak, Vivek S. Borkar, Parth Dodhia

Abstract

Using a martingale concentration inequality, concentration bounds "from time n_0 on" are derived for stochastic approximation algorithms with contractive maps and both martingale difference and Markov noises. These are applied to reinforcement learning algorithms, in particular to asynchronous Q-learning and TD(0).
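
As a rough illustration of the setting in the abstract, the sketch below runs a generic stochastic approximation iteration x_{n+1} = x_n + a(n) (F(x_n) - x_n + M_{n+1}) with a contractive map F and zero-mean i.i.d. noise standing in for a martingale difference sequence. The map `F`, the step size a(n) = 1/(n+1), the Gaussian noise, and the function name `contractive_sa` are all assumptions chosen for illustration; they are not taken from the paper, and the paper's concentration bounds are not reproduced here.

```python
import random


def contractive_sa(F, x0, n_steps, noise_std=0.1, seed=0):
    """Run x_{n+1} = x_n + a(n) * (F(x_n) - x_n + M_{n+1}).

    Assumptions (illustrative only): step size a(n) = 1 / (n + 1) and
    i.i.d. zero-mean Gaussian noise M_{n+1} in each coordinate.
    """
    rng = random.Random(seed)
    x = list(x0)
    for n in range(n_steps):
        a = 1.0 / (n + 1)  # decreasing step size
        fx = F(x)
        # Noisy move of each coordinate toward F(x)
        x = [
            xi + a * (fi - xi + rng.gauss(0.0, noise_std))
            for xi, fi in zip(x, fx)
        ]
    return x


def F(x):
    # Hypothetical contraction (factor 0.5 in the sup norm)
    # with unique fixed point (2, -2).
    return [0.5 * x[0] + 1.0, 0.5 * x[1] - 1.0]


x_final = contractive_sa(F, [10.0, -10.0], n_steps=20000)
print(x_final)  # the iterates should end up near the fixed point (2, -2)
```

In this toy case the iterates settle near the fixed point of F despite the noise. Results of the kind described in the abstract quantify how tightly, and with what probability, such iterates stay close to the fixed point from some time n_0 onward.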

Tasks

Q-Learning, Reinforcement Learning (RL)
