SOTAVerified

Distributional Reinforcement Learning

The value distribution is the distribution of the random return received by a reinforcement learning agent. It has been used for specific purposes such as implementing risk-aware behaviour.
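As a rough illustration of why the full distribution matters for risk-aware behaviour, the sketch below compares two hypothetical actions whose returns have the same mean but different spread. The Gaussian return models, the `cvar` helper, and the choice of conditional value-at-risk (CVaR) at level 0.1 are all assumptions made for this example, not something the page prescribes.

```python
import random
import statistics

def cvar(samples, alpha=0.1):
    """Average of the worst alpha-fraction of sampled returns (lower tail)."""
    ordered = sorted(samples)
    k = max(1, int(alpha * len(ordered)))
    return statistics.fmean(ordered[:k])

rng = random.Random(0)

# Hypothetical sampled returns: same expected return (1.0), different spread.
safe_returns = [rng.gauss(1.0, 0.1) for _ in range(10_000)]
risky_returns = [rng.gauss(1.0, 2.0) for _ in range(10_000)]

mean_safe = statistics.fmean(safe_returns)
mean_risky = statistics.fmean(risky_returns)
cvar_safe = cvar(safe_returns)
cvar_risky = cvar(risky_returns)

# An agent ranking actions by expected return sees them as nearly equal;
# an agent ranking by CVaR prefers the action with the better lower tail.
print(f"mean: safe={mean_safe:.2f}, risky={mean_risky:.2f}")
print(f"CVaR(0.1): safe={cvar_safe:.2f}, risky={cvar_risky:.2f}")
```

The expected values are almost identical, so a purely value-based agent has no reason to prefer either action, while the tail statistic separates them clearly.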

We have a random return Z whose expectation is the value Q. This random return is also described by a recursive equation, but one of a distributional nature.
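The recursion is commonly written as Z(x, a) = R(x, a) + γ Z(X', A'), with equality holding in distribution. The sketch below applies that recursion to a toy single-state process, representing Z by a set of sampled "particles". The constants `GAMMA`, `P_CONTINUE`, and `REWARD`, and the particle representation itself, are assumptions chosen for illustration only; practical algorithms use other parameterisations.

```python
import random
import statistics

GAMMA = 0.9       # discount factor (assumed for this toy example)
P_CONTINUE = 0.5  # probability the episode continues after a step (assumed)
REWARD = 1.0      # deterministic per-step reward (assumed)

def distributional_backup(particles, rng):
    """One sample-based application of Z <- R + gamma * Z'."""
    new_particles = []
    for _ in range(len(particles)):
        if rng.random() < P_CONTINUE:
            # Episode continues: draw a sample of the next state's return.
            next_return = rng.choice(particles)
        else:
            # Episode terminates: no further return.
            next_return = 0.0
        new_particles.append(REWARD + GAMMA * next_return)
    return new_particles

rng = random.Random(0)
particles = [0.0] * 5_000  # initial guess for the return distribution
for _ in range(40):
    particles = distributional_backup(particles, rng)

# The mean of Z should approximately match the value from the ordinary
# Bellman equation Q = REWARD + GAMMA * P_CONTINUE * Q.
q_from_particles = statistics.fmean(particles)
q_closed_form = REWARD / (1.0 - GAMMA * P_CONTINUE)
print(q_from_particles, q_closed_form)
```

Taking the mean of the particles recovers Q up to sampling noise, while the particles themselves retain the spread of possible returns that the expectation discards.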

Papers

Showing 31–40 of 137 papers

| Title | Status | Hype |
|---|---|---|
| Near-Minimax-Optimal Distributional Reinforcement Learning with a Generative Model | | 0 |
| More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning | | 0 |
| Echoes of Socratic Doubt: Embracing Uncertainty in Calibrated Evidential Reinforcement Learning | Code | 0 |
| Distributional Off-policy Evaluation with Bellman Residual Minimization | Code | 0 |
| A Robust Quantile Huber Loss With Interpretable Parameter Adjustment In Distributional Reinforcement Learning | Code | 0 |
| Distributional Reinforcement Learning-based Energy Arbitrage Strategies in Imbalance Settlement Mechanism | | 0 |
| Noise Distribution Decomposition based Multi-Agent Distributional Reinforcement Learning | | 0 |
| Distributional Bellman Operators over Mean Embeddings | Code | 0 |
| An introduction to reinforcement learning for neuroscience | | 0 |
| Beyond Average Return in Markov Decision Processes | | 0 |

No leaderboard results yet.