Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory

2020-12-01 · NeurIPS 2020

Yufeng Zhang, Qi Cai, Zhuoran Yang, Yongxin Chen, Zhaoran Wang

Abstract

Temporal-difference and Q-learning play a key role in deep reinforcement learning, where they are empowered by expressive nonlinear function approximators such as neural networks. At the core of their empirical successes is the learned feature representation, which embeds rich observations, e.g., images and texts, into the latent space that encodes semantic structures. Meanwhile, the evolution of such a feature representation is crucial to the convergence of temporal-difference and Q-learning.

Tasks

Deep Reinforcement Learning, Q-Learning, Reinforcement Learning (RL)
