Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory

2020-12-01 · NeurIPS 2020

Yufeng Zhang, Qi Cai, Zhuoran Yang, Yongxin Chen, Zhaoran Wang

Abstract

Temporal-difference and Q-learning play a key role in deep reinforcement learning, where they are empowered by expressive nonlinear function approximators such as neural networks. At the core of their empirical successes is the learned feature representation, which embeds rich observations, e.g., images and texts, into the latent space that encodes semantic structures. Meanwhile, the evolution of such a feature representation is crucial to the convergence of temporal-difference and Q-learning.
