SOTAVerified

Distributional Reinforcement Learning

The value distribution is the distribution of the random return received by a reinforcement learning agent. It has been used for specific purposes such as implementing risk-aware behaviour.
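As a rough illustration of why the full distribution of returns matters for risk-aware behaviour, the sketch below compares two hypothetical actions by their mean return and by a lower-tail risk measure (CVaR, the mean of the worst fraction of outcomes). The sample distributions, the `cvar` helper, and the choice of CVaR as the risk criterion are all assumptions made for illustration; they do not come from any paper listed here.

```python
import random
import statistics


def cvar(returns, alpha=0.1):
    """Mean of the worst alpha-fraction of sampled returns (lower tail)."""
    ordered = sorted(returns)
    k = max(1, int(alpha * len(ordered)))  # keep at least one sample
    return statistics.fmean(ordered[:k])


rng = random.Random(0)

# Hypothetical return samples for two actions (assumed Gaussian for simplicity).
safe = [rng.gauss(1.0, 0.1) for _ in range(10_000)]
risky = [rng.gauss(1.2, 2.0) for _ in range(10_000)]

for name, samples in [("safe", safe), ("risky", risky)]:
    mean_return = statistics.fmean(samples)  # what an expected-value agent sees
    tail_return = cvar(samples, alpha=0.1)   # what a risk-averse agent might use
    print(f"{name}: mean={mean_return:.2f}, CVaR(10%)={tail_return:.2f}")
```

Under these assumptions, an agent that looks only at the mean would prefer the risky action, while one that ranks actions by the lower tail would prefer the safe one. That distinction is only available when the distribution of returns is modelled, not just its expectation.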

We have a random return Z whose expectation is the value Q. This random return is also described by a recursive equation, but one of a distributional nature.
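For reference, the relationship between Q and Z and the recursive equation for Z are commonly written as follows in the distributional reinforcement learning literature. The symbols other than Z and Q are assumptions introduced here: x and a denote a state and an action, R(x, a) the random reward, γ the discount factor, and (X', A') the random next state-action pair. The notation with D over the equals sign means equality in distribution.

```latex
Q(x, a) = \mathbb{E}\left[ Z(x, a) \right],
\qquad
Z(x, a) \overset{D}{=} R(x, a) + \gamma \, Z(X', A')
```

Taking expectations on both sides of the second equation recovers the usual Bellman equation for Q.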

Papers

Showing 51–75 of 137 papers

Title | Status | Hype
Invariance to Quantile Selection in Distributional Continuous Control | | 0
Is Risk-Sensitive Reinforcement Learning Properly Resolved? | | 0
Learning Risk-Aware Quadrupedal Locomotion using Distributional Reinforcement Learning | | 0
Millimeter Wave Communications with an Intelligent Reflector: Performance Optimization and Distributional Reinforcement Learning | | 0
Minimizing Safety Interference for Safe and Comfortable Automated Driving with Distributional Reinforcement Learning | | 0
MMD-MIX: Value Function Factorisation with Maximum Mean Discrepancy for Cooperative Multi-Agent Reinforcement Learning | | 0
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning | | 0
Multi-compartment Neuron and Population Encoding Powered Spiking Neural Network for Deep Distributional Reinforcement Learning | | 0
Near-Minimax-Optimal Distributional Reinforcement Learning with a Generative Model | | 0
Noise Distribution Decomposition based Multi-Agent Distributional Reinforcement Learning | | 0
Non-Crossing Quantile Regression for Distributional Reinforcement Learning | | 0
Non-decreasing Quantile Function Network with Efficient Exploration for Distributional Reinforcement Learning | | 0
Nonlinear Distributional Gradient Temporal-Difference Learning | | 0
Normality-Guided Distributional Reinforcement Learning for Continuous Control | | 0
Offline and Distributional Reinforcement Learning for Radio Resource Management | | 0
Offline and Distributional Reinforcement Learning for Wireless Communications | | 0
One-Step Distributional Reinforcement Learning | | 0
On Policy Evaluation Algorithms in Distributional Reinforcement Learning | | 0
On solutions of the distributional Bellman equation | | 0
PACER: A Fully Push-forward-based Distributional Reinforcement Learning Algorithm | | 0
PG-Rainbow: Using Distributional Reinforcement Learning in Policy Gradient Methods | | 0
Pitfall of Optimism: Distributional Reinforcement Learning by Randomizing Risk Criterion | | 0
Policy Evaluation in Distributional LQR | | 0
Policy Gradient Methods for Risk-Sensitive Distributional Reinforcement Learning with Provable Convergence | | 0
Provable Risk-Sensitive Distributional Reinforcement Learning with General Function Approximation | | 0

No leaderboard results yet.