Low-rank State-action Value-function Approximation

2021-04-18

Sergio Rozada, Victor Tenorio, Antonio G. Marques

Code

github.com/sergiorozada12/low-rank-rl
Official implementation

Abstract

Value functions are central to Dynamic Programming and Reinforcement Learning, but their exact estimation suffers from the curse of dimensionality, which hinders the development of practical value-function (VF) estimation algorithms. Several approaches have been proposed to overcome this issue, from non-parametric schemes that aggregate states or actions to parametric approximations of state and action VFs via, e.g., linear estimators or deep neural networks. Notably, several problems with high-dimensional state spaces can be well approximated by an intrinsic low-rank structure. Motivated by this and leveraging results from low-rank optimization, this paper proposes several stochastic algorithms to estimate a low-rank factorization of the Q(s, a) matrix. This is a non-parametric alternative to VF approximation that dramatically reduces the computational and sample complexities relative to classical Q-learning methods, which estimate Q(s, a) separately for each state-action pair.
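
To make the idea in the abstract concrete, the following is a minimal illustrative sketch, not the authors' algorithm (see the linked repository for that). It assumes a small tabular MDP with known transition probabilities used only for sampling, approximates the Q matrix as the product of two factors, and updates one row of each factor per step with a stochastic semi-gradient step on the temporal-difference error. The function name `low_rank_q_learning`, the factor names, and all hyperparameters are hypothetical choices made for this example.

```python
import numpy as np


def low_rank_q_learning(P, R, rank=2, gamma=0.9, alpha=0.01,
                        eps=0.1, steps=5000, seed=0):
    """Estimate Q ~= L @ M.T from sampled transitions (illustrative sketch).

    P: array (S, A, S) of transition probabilities (assumed known here,
       used only to sample next states).
    R: array (S, A) of rewards.
    Returns the factors L (S, rank) and M (A, rank).
    """
    rng = np.random.default_rng(seed)
    S, A = R.shape
    # Small random initialisation of both factors.
    L = 0.1 * rng.standard_normal((S, rank))
    M = 0.1 * rng.standard_normal((A, rank))
    s = int(rng.integers(S))
    for _ in range(steps):
        # Epsilon-greedy action from the current low-rank estimate.
        q_row = L[s] @ M.T
        a = int(rng.integers(A)) if rng.random() < eps else int(np.argmax(q_row))
        s_next = int(rng.choice(S, p=P[s, a]))
        # Temporal-difference error with the usual Q-learning target.
        target = R[s, a] + gamma * np.max(L[s_next] @ M.T)
        td = target - L[s] @ M[a]
        # Semi-gradient step on the two rows that define Q(s, a).
        l_old = L[s].copy()
        L[s] += alpha * td * M[a]
        M[a] += alpha * td * l_old
        s = s_next
    return L, M


# Tiny random MDP, used only to exercise the function.
rng = np.random.default_rng(1)
S, A, rank = 6, 3, 2
P = rng.dirichlet(np.ones(S), size=(S, A))  # shape (S, A, S), rows sum to 1
R = rng.random((S, A))
L, M = low_rank_q_learning(P, R, rank=rank)
Q_hat = L @ M.T                   # reconstructed (S, A) value estimates
n_low_rank = rank * (S + A)       # parameters stored by the factorization
n_tabular = S * A                 # entries stored by tabular Q-learning
```

In this sketch the factorization stores rank * (S + A) numbers instead of the S * A entries of a full table, and each update touches a whole row of each factor, so every sample informs the estimates of several state-action pairs at once. Convergence of this simplified update is not guaranteed and is not claimed here.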

Tasks

Q-Learning
