SOTAVerified

Distributional Reinforcement Learning

The value distribution is the distribution of the random return received by a reinforcement learning agent. It has been used for specific purposes such as implementing risk-aware behaviour.

We have a random return Z whose expectation is the value Q. This random return is also described by a recursive equation, but one of a distributional nature.
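As an illustration, the recursion is commonly written as Z(s, a) =_D R(s, a) + γ Z(S', A'), where =_D denotes equality in distribution. The sketch below is a minimal, hypothetical example and is not taken from any of the listed papers: it assumes a two-step chain with a Bernoulli reward at each step, a discount of 0.5, and rewards independent of the next state's return. The names `backup`, `mean`, and `GAMMA` are made up for this example. It applies the distributional backup exactly and checks that the mean of Z matches the ordinary Bellman value Q.

```python
from collections import defaultdict

GAMMA = 0.5  # assumed discount factor for this toy example


def backup(reward_dist, next_dist, gamma=GAMMA):
    """One distributional Bellman backup: Z(s) =_D R + gamma * Z(s').

    Distributions are dicts mapping outcome -> probability.
    Assumes the reward is independent of the next-state return.
    """
    out = defaultdict(float)
    for r, p_r in reward_dist.items():
        for z, p_z in next_dist.items():
            out[r + gamma * z] += p_r * p_z
    return dict(out)


def mean(dist):
    """Expectation of a finite distribution."""
    return sum(v * p for v, p in dist.items())


# Hypothetical chain: state 0 -> state 1 -> terminal.
reward = {0.0: 0.5, 1.0: 0.5}   # reward is 0 or 1 with equal probability
z_terminal = {0.0: 1.0}         # the return after termination is always 0

z1 = backup(reward, z_terminal)  # return distribution at state 1
z0 = backup(reward, z1)          # return distribution at state 0

# Ordinary (expected-value) Bellman recursion for comparison.
q1 = mean(reward) + GAMMA * 0.0
q0 = mean(reward) + GAMMA * q1

print(z0)        # {0.0: 0.25, 0.5: 0.25, 1.0: 0.25, 1.5: 0.25}
print(mean(z0))  # 0.75, equal to q0
```

The full distribution `z0` retains information that the scalar `q0` discards, such as the probability of a zero return, which is what risk-aware methods build on.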

Papers

Showing 51-100 of 137 papers

Title | Status | Hype
Pitfall of Optimism: Distributional Reinforcement Learning by Randomizing Risk Criterion | - | 0
Distributional Reinforcement Learning with Online Risk-awareness Adaption | - | 0
Estimation and Inference in Distributional Reinforcement Learning | Code | 0
Learning Risk-Aware Quadrupedal Locomotion using Distributional Reinforcement Learning | - | 0
Deep Reinforcement Learning for Artificial Upwelling Energy Management | - | 0
Value-Distributional Model-Based Reinforcement Learning | Code | 0
Variance Control for Distributional Reinforcement Learning | Code | 0
Cramer Type Distances for Learning Gaussian Mixture Models by Gradient Descent | - | 0
Distributional Model Equivalence for Risk-Sensitive Reinforcement Learning | Code | 0
Is Risk-Sensitive Reinforcement Learning Properly Resolved? | - | 0
Diverse Projection Ensembles for Distributional Reinforcement Learning | - | 0
PACER: A Fully Push-forward-based Distributional Reinforcement Learning Algorithm | - | 0
Improving the generalizability and robustness of large-scale traffic signal control | - | 0
The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation | - | 0
Distributional Reinforcement Learning with Dual Expectile-Quantile Regression | - | 0
The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning | Code | 0
One-Step Distributional Reinforcement Learning | - | 0
Policy Evaluation in Distributional LQR | - | 0
Toward Risk-based Optimistic Exploration for Cooperative Multi-Agent Reinforcement Learning | - | 0
Constrained Reinforcement Learning using Distributional Representation for Trustworthy Quadrotor UAV Tracking Control | Code | 0
Distributional constrained reinforcement learning for supply chain optimization | Code | 0
Multi-compartment Neuron and Population Encoding Powered Spiking Neural Network for Deep Distributional Reinforcement Learning | - | 0
An Analysis of Quantile Temporal-Difference Learning | - | 0
Invariance to Quantile Selection in Distributional Continuous Control | - | 0
Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds | - | 0
How Does Return Distribution in Distributional Reinforcement Learning Help Optimization? | - | 0
Normality-Guided Distributional Reinforcement Learning for Continuous Control | - | 0
The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning | - | 0
Risk Perspective Exploration in Distributional Reinforcement Learning | - | 0
Robust Reinforcement Learning with Distributional Risk-averse formulation | - | 0
IGN : Implicit Generative Networks | Code | 0
A Simulation Environment and Reinforcement Learning Method for Waste Reduction | - | 0
Interpretable Stochastic Model Predictive Control using Distributional Reinforced Estimation for Quadrotor Tracking Systems | - | 0
Distributional Reinforcement Learning for Scheduling of Chemical Production Processes | - | 0
Exploration with Multi-Sample Target Values for Distributional Reinforcement Learning | - | 0
Distributional Reinforcement Learning with Regularized Wasserstein Loss | Code | 0
On solutions of the distributional Bellman equation | - | 0
Conservative Distributional Reinforcement Learning with Safety Constraints | - | 0
Robustness and risk management via distributional dynamic programming | - | 0
Conjugated Discrete Distributions for Distributional Reinforcement Learning | Code | 0
Two steps to risk sensitivity | Code | 0
Distributional Reinforcement Learning for Multi-Dimensional Reward Functions | Code | 0
The Benefits of Being Categorical Distributional: Uncertainty-aware Regularized Exploration in Reinforcement Learning | - | 0
A Cramér Distance perspective on Quantile Regression based Distributional Reinforcement Learning | Code | 0
Exploring the Robustness of Distributional Reinforcement Learning against Noisy State Observations | - | 0
Towards Understanding Distributional Reinforcement Learning: Regularization, Optimization, Acceleration and Sinkhorn Algorithm | - | 0
Distributional Reinforcement Learning with Monotonic Splines | - | 0
Distributional Perturbation for Efficient Exploration in Distributional Reinforcement Learning | - | 0
Exploring the Training Robustness of Distributional Reinforcement Learning against Noisy State Observations | Code | 0
Minimizing Safety Interference for Safe and Comfortable Automated Driving with Distributional Reinforcement Learning | - | 0

No leaderboard results yet.