SOTAVerified

Distributional Reinforcement Learning

The value distribution is the distribution of the random return received by a reinforcement learning agent. It has been used for specific purposes such as implementing risk-aware behaviour.
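As a rough illustration of risk-aware behaviour, the sketch below compares two actions by the mean of their sampled returns and by a lower-tail risk measure (conditional value at risk, CVaR). The return samples, the helper names `mean` and `cvar`, and the choice of CVaR as the risk measure are all assumptions made for this example; they do not come from any particular paper listed here.

```python
import math

def mean(returns):
    """Expected return estimated from samples."""
    return sum(returns) / len(returns)

def cvar(returns, alpha):
    """Average of the worst alpha-fraction of sampled returns (lower tail)."""
    ordered = sorted(returns)
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k

# Hypothetical return samples for two actions (illustrative values only).
safe = [1.0, 1.0, 1.0, 1.0]
risky = [-10.0, 5.0, 5.0, 6.0]

# A risk-neutral agent ranks actions by expected return.
mean_choice = "risky" if mean(risky) > mean(safe) else "safe"

# A risk-averse agent ranks actions by the worst 25% of outcomes.
cvar_choice = "risky" if cvar(risky, 0.25) > cvar(safe, 0.25) else "safe"

print(mean_choice, cvar_choice)
```

The two criteria can disagree only because the full distribution of returns is available; the expectation alone discards the tail information that the risk-averse rule relies on.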

We have a random return Z whose expectation is the value Q. This random return is also described by a recursive equation, but one of a distributional nature.
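For reference, this recursion is commonly written as the distributional Bellman equation. The notation is assumed here, since the text does not define it: x is a state, a an action, R the random reward, \gamma the discount factor, (X', A') the random next state-action pair, and \overset{D}{=} denotes equality in distribution.

```latex
Q(x, a) = \mathbb{E}\left[ Z(x, a) \right],
\qquad
Z(x, a) \overset{D}{=} R(x, a) + \gamma \, Z(X', A')
```

Taking expectations of both sides of the second relation recovers the usual Bellman equation for Q.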

Papers

Showing 71-80 of 137 papers

Title | Status | Hype
Distributional constrained reinforcement learning for supply chain optimization | Code | 0
Multi-compartment Neuron and Population Encoding Powered Spiking Neural Network for Deep Distributional Reinforcement Learning | | 0
An Analysis of Quantile Temporal-Difference Learning | | 0
Invariance to Quantile Selection in Distributional Continuous Control | | 0
Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds | | 0
How Does Return Distribution in Distributional Reinforcement Learning Help Optimization? | | 0
Normality-Guided Distributional Reinforcement Learning for Continuous Control | | 0
The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning | | 0
Risk Perspective Exploration in Distributional Reinforcement Learning | | 0
Robust Reinforcement Learning with Distributional Risk-averse formulation | | 0

No leaderboard results yet.