SOTAVerified

Distributional Reinforcement Learning

The value distribution is the distribution of the random return received by a reinforcement learning agent. It has been used for specific purposes such as implementing risk-aware behaviour.

The random return Z has the value Q as its expectation. Like Q, this random return is described by a recursive equation, but one of a distributional nature.
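
For reference, this recursion is commonly written as the distributional Bellman equation. The sketch below uses standard notation that is assumed here rather than taken from this page: a reward R(x, a), a discount factor \gamma, a transition kernel P, and a next state X' and next action A' drawn under a policy \pi.

```latex
Q(x, a) = \mathbb{E}\left[ Z(x, a) \right], \qquad
Z(x, a) \overset{D}{=} R(x, a) + \gamma\, Z(X', A'),
\quad X' \sim P(\cdot \mid x, a),\; A' \sim \pi(\cdot \mid X').
```

Here \overset{D}{=} denotes equality in distribution. Taking expectations of both sides recovers the usual Bellman equation for Q.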

Papers

Showing 21-30 of 137 papers

Title | Status | Hype
Beyond Average Return in Markov Decision Processes | | 0
Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds | | 0
Bellman Diffusion: Generative Modeling as Learning a Linear Operator in the Distribution Space | | 0
Conservative Distributional Reinforcement Learning with Safety Constraints | | 0
A Distributional Perspective on Actor-Critic Framework | | 0
Controlling Synthetic Characters in Simulations: A Case for Cognitive Architectures and Sigma | | 0
Cramer Type Distances for Learning Gaussian Mixture Models by Gradient Descent | | 0
An introduction to reinforcement learning for neuroscience | | 0
CTRLS: Chain-of-Thought Reasoning via Latent State-Transition | | 0
Deep Distributional Learning with Non-crossing Quantile Network | | 0

No leaderboard results yet.