SOTAVerified

Distributional Reinforcement Learning

The value distribution is the distribution of the random return received by a reinforcement learning agent. It has been used for specific purposes such as implementing risk-aware behaviour.

We have a random return Z whose expectation is the value Q. This random return is also described by a recursive equation, but one of a distributional nature.
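The relationship between Z and Q can be illustrated with a minimal sketch. The recursion is commonly written as Z(x, a) being equal in distribution to R(x, a) + γ Z(X', A'). The code below samples returns through that recursion on a hypothetical one-state process (rewards of +1 or -1 with equal probability, a discount of 0.9, and a 20-step horizon are all assumptions made for illustration, not part of the source). It then takes the sample mean as an estimate of Q and the mean of the lower tail as one possible risk-aware statistic.

```python
import random


def sample_return(gamma, steps, rng):
    """Draw one sample of Z via the recursion Z = R + gamma * Z'.

    The reward model (+1 or -1 with equal probability) is a hypothetical
    stand-in for an environment, chosen only to keep the sketch small.
    """
    if steps == 0:
        return 0.0  # no further reward once the horizon is reached
    reward = rng.choice([1.0, -1.0])
    return reward + gamma * sample_return(gamma, steps - 1, rng)


def lower_tail_mean(samples, alpha):
    """Average of the worst alpha-fraction of samples (a CVaR-style statistic)."""
    k = max(1, int(alpha * len(samples)))
    return sum(sorted(samples)[:k]) / k


rng = random.Random(0)  # fixed seed so the run is repeatable
returns = [sample_return(0.9, 20, rng) for _ in range(5000)]

q_estimate = sum(returns) / len(returns)      # expectation of Z, i.e. Q
risk_estimate = lower_tail_mean(returns, 0.1)  # summarises the bad outcomes

print("Q estimate:", q_estimate)
print("lower-tail mean:", risk_estimate)
```

An agent that looks only at Q sees a value near zero here, while the lower-tail statistic exposes how bad the unlucky returns can be. That gap is the kind of information a risk-aware policy can use.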

Papers

Showing 81–90 of 137 papers

Title | Status | Hype
Safe Distributional Reinforcement Learning | | 0
Sample-based Distributional Policy Gradient | | 0
Second-Order Bounds for [0,1]-Valued Regression via Betting Loss | | 0
Bellman Unbiasedness: Toward Provably Efficient Distributional Reinforcement Learning with General Value Function Approximation | | 0
Statistical Efficiency of Distributional Temporal Difference Learning and Freedman's Inequality in Hilbert Spaces | | 0
Statistics and Samples in Distributional Reinforcement Learning | | 0
Stochastically Dominant Distributional Reinforcement Learning | | 0
The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning | | 0
The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning | | 0
The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation | | 0

No leaderboard results yet.