SOTAVerified

Distributional Reinforcement Learning

The value distribution is the distribution of the random return received by a reinforcement learning agent. It has been used for specific purposes such as implementing risk-aware behaviour.

The random return Z has the value Q as its expectation. Like Q, this random return is described by a recursive equation, but one of a distributional nature.
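The recursion is commonly written as Z(x, a) being equal in distribution to R(x, a) + γ Z(X', A'). As a minimal sketch, assuming a deterministic reward and transition and a next-state return that takes finitely many values (the function names and numbers below are illustrative, not taken from any paper listed here), one backup step shifts and scales the support of Z while leaving the probabilities untouched:

```python
def distributional_backup(reward, gamma, next_atoms, next_probs):
    """Return the atoms and probabilities of reward + gamma * Z'.

    Assumes a deterministic reward and transition (illustrative only).
    """
    # Each possible next-state return z becomes reward + gamma * z;
    # the probability attached to it does not change.
    atoms = [reward + gamma * z for z in next_atoms]
    return atoms, list(next_probs)


def expectation(atoms, probs):
    """Mean of a discrete distribution given as atoms and probabilities."""
    return sum(z * p for z, p in zip(atoms, probs))


# Hypothetical next-state return: 0 or 10 with equal probability.
next_atoms = [0.0, 10.0]
next_probs = [0.5, 0.5]

atoms, probs = distributional_backup(1.0, 0.9, next_atoms, next_probs)

q_next = expectation(next_atoms, next_probs)  # value of the next state-action pair
q = expectation(atoms, probs)                 # value after the backup
```

Taking expectations on both sides recovers the usual Bellman recursion for Q: here `q` equals `1.0 + 0.9 * q_next`, while `atoms` and `probs` retain the full shape of the return distribution that a risk-aware agent could exploit.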

Papers

Showing 61-70 of 137 papers

Title | Status | Hype
The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning | | 0
The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning | | 0
The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation | | 0
Toward Risk-based Optimistic Exploration for Cooperative Multi-Agent Reinforcement Learning | | 0
The Benefits of Being Categorical Distributional: Uncertainty-aware Regularized Exploration in Reinforcement Learning | | 0
Towards Understanding Distributional Reinforcement Learning: Regularization, Optimization, Acceleration and Sinkhorn Algorithm | | 0
Uncertainty-Aware Transient Stability-Constrained Preventive Redispatch: A Distributional Reinforcement Learning Approach | | 0
Distributional Perturbation for Efficient Exploration in Distributional Reinforcement Learning | | 0
Distributional Reinforcement Learning-based Energy Arbitrage Strategies in Imbalance Settlement Mechanism | | 0
Distributional Reinforcement Learning for Efficient Exploration | | 0
Page 7 of 14

No leaderboard results yet.