SOTAVerified

Distributional Reinforcement Learning

The value distribution is the distribution of the random return received by a reinforcement learning agent. It has been used for specific purposes such as implementing risk-aware behaviour.

We have a random return Z whose expectation is the value Q. This random return is also described by a recursive equation, but one of a distributional nature.
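
As a sketch in commonly used notation, the relationship between Z and Q and the distributional recursion can be written as follows. The symbols x (state), a (action), R (random reward), γ (discount factor), P (transition kernel) and π (policy) are assumptions introduced here for illustration; they are not defined in the text above.

```latex
Q(x, a) = \mathbb{E}\left[ Z(x, a) \right]

Z(x, a) \overset{D}{=} R(x, a) + \gamma \, Z(X', A'),
\qquad X' \sim P(\cdot \mid x, a), \quad A' \sim \pi(\cdot \mid X')
```

The second equality holds in distribution rather than in value. Taking expectations of both sides recovers the usual Bellman equation for Q.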

Papers

Showing 51-100 of 137 papers

Title | Status | Hype
Invariance to Quantile Selection in Distributional Continuous Control |  | 0
Is Risk-Sensitive Reinforcement Learning Properly Resolved? |  | 0
Learning Risk-Aware Quadrupedal Locomotion using Distributional Reinforcement Learning |  | 0
Millimeter Wave Communications with an Intelligent Reflector: Performance Optimization and Distributional Reinforcement Learning |  | 0
Minimizing Safety Interference for Safe and Comfortable Automated Driving with Distributional Reinforcement Learning |  | 0
MMD-MIX: Value Function Factorisation with Maximum Mean Discrepancy for Cooperative Multi-Agent Reinforcement Learning |  | 0
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning |  | 0
Multi-compartment Neuron and Population Encoding Powered Spiking Neural Network for Deep Distributional Reinforcement Learning |  | 0
Near-Minimax-Optimal Distributional Reinforcement Learning with a Generative Model |  | 0
Noise Distribution Decomposition based Multi-Agent Distributional Reinforcement Learning |  | 0
Non-Crossing Quantile Regression for Distributional Reinforcement Learning |  | 0
Non-decreasing Quantile Function Network with Efficient Exploration for Distributional Reinforcement Learning |  | 0
Nonlinear Distributional Gradient Temporal-Difference Learning |  | 0
Normality-Guided Distributional Reinforcement Learning for Continuous Control |  | 0
Offline and Distributional Reinforcement Learning for Radio Resource Management |  | 0
Offline and Distributional Reinforcement Learning for Wireless Communications |  | 0
One-Step Distributional Reinforcement Learning |  | 0
On Policy Evaluation Algorithms in Distributional Reinforcement Learning |  | 0
On solutions of the distributional Bellman equation |  | 0
PACER: A Fully Push-forward-based Distributional Reinforcement Learning Algorithm |  | 0
PG-Rainbow: Using Distributional Reinforcement Learning in Policy Gradient Methods |  | 0
Pitfall of Optimism: Distributional Reinforcement Learning by Randomizing Risk Criterion |  | 0
Policy Evaluation in Distributional LQR |  | 0
Policy Gradient Methods for Risk-Sensitive Distributional Reinforcement Learning with Provable Convergence |  | 0
Provable Risk-Sensitive Distributional Reinforcement Learning with General Function Approximation |  | 0
Risk-averse policies for natural gas futures trading using distributional reinforcement learning |  | 0
Risk Perspective Exploration in Distributional Reinforcement Learning |  | 0
Robustness and risk management via distributional dynamic programming |  | 0
Robust Probabilistic Model Checking with Continuous Reward Domains |  | 0
Robust Reinforcement Learning with Distributional Risk-averse formulation |  | 0
Safe Distributional Reinforcement Learning |  | 0
Sample-based Distributional Policy Gradient |  | 0
Second-Order Bounds for [0,1]-Valued Regression via Betting Loss |  | 0
Bellman Unbiasedness: Toward Provably Efficient Distributional Reinforcement Learning with General Value Function Approximation |  | 0
Statistical Efficiency of Distributional Temporal Difference Learning and Freedman's Inequality in Hilbert Spaces |  | 0
Statistics and Samples in Distributional Reinforcement Learning |  | 0
Stochastically Dominant Distributional Reinforcement Learning |  | 0
The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning |  | 0
The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |  | 0
The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation |  | 0
Toward Risk-based Optimistic Exploration for Cooperative Multi-Agent Reinforcement Learning |  | 0
The Benefits of Being Categorical Distributional: Uncertainty-aware Regularized Exploration in Reinforcement Learning |  | 0
Towards Understanding Distributional Reinforcement Learning: Regularization, Optimization, Acceleration and Sinkhorn Algorithm |  | 0
Uncertainty-Aware Transient Stability-Constrained Preventive Redispatch: A Distributional Reinforcement Learning Approach |  | 0
SENTINEL: Taming Uncertainty with Ensemble-based Distributional Reinforcement Learning |  | 0
A Comparative Analysis of Expected and Distributional Reinforcement Learning |  | 0
Action Gaps and Advantages in Continuous-Time Distributional Reinforcement Learning |  | 0
Adaptive Nesterov Accelerated Distributional Deep Hedging for Efficient Volatility Risk Management |  | 0
Addressing Inherent Uncertainty: Risk-Sensitive Behavior Generation for Automated Driving using Distributional Reinforcement Learning |  | 0
A Distributional Perspective on Actor-Critic Framework |  | 0
Page 2 of 3

No leaderboard results yet.