SOTAVerified

Distributional Reinforcement Learning

The value distribution is the distribution of the random return received by a reinforcement learning agent. It has been used for specific purposes such as implementing risk-aware behaviour.

The random return Z has the value Q as its expectation. Like Q, this random return is described by a recursive equation, but one of a distributional nature.
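
The relationship between Z and Q, and the recursion referred to here, are commonly written as the distributional Bellman equation. The block below is a sketch in standard notation, not taken from this page: the random reward R, the discount factor \gamma, and the random next state-action pair (X', A') are assumed symbols that the text above does not define.

```latex
Q(x, a) = \mathbb{E}\left[ Z(x, a) \right],
\qquad
Z(x, a) \overset{D}{=} R(x, a) + \gamma \, Z(X', A')
```

Here \overset{D}{=} denotes equality in distribution: both sides are random variables with the same law, in contrast with the usual Bellman equation, which equates expected values only.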

Papers

Showing 41-50 of 137 papers

| Title | Status | Hype |
| --- | --- | --- |
| IGN : Implicit Generative Networks | Code | 0 |
| Fully Parameterized Quantile Function for Distributional Reinforcement Learning | Code | 0 |
| The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning | Code | 0 |
| Deep Distributional Learning with Non-crossing Quantile Network | | 0 |
| CTRLS: Chain-of-Thought Reasoning via Latent State-Transition | | 0 |
| A Point-Based Algorithm for Distributional Reinforcement Learning in Partially Observable Domains | | 0 |
| Cramer Type Distances for Learning Gaussian Mixture Models by Gradient Descent | | 0 |
| An introduction to reinforcement learning for neuroscience | | 0 |
| Controlling Synthetic Characters in Simulations: A Case for Cognitive Architectures and Sigma | | 0 |
| Distributional Reinforcement Learning with Ensembles | | 0 |

No leaderboard results yet.