SOTAVerified

Distributional Reinforcement Learning

The value distribution is the distribution of the random return received by a reinforcement learning agent. It has been used for specific purposes such as implementing risk-aware behaviour.

We have a random return Z whose expectation is the value Q. This random return is also described by a recursive equation, but one of a distributional nature.
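As a rough illustration of that recursion, the sketch below applies a distributional backup of the form Z = R + gamma * Z' to a tiny toy process and checks that the expectation of Z matches the scalar value Q. The reward distribution, discount factor, horizon, and the helper names `distributional_backup` and `expectation` are all assumptions made for this example; none of them come from the papers listed here.

```python
from collections import defaultdict

# All constants below are hypothetical, chosen only for illustration.
GAMMA = 0.5                          # discount factor
REWARD_DIST = {0.0: 0.5, 1.0: 0.5}   # reward value -> probability
HORIZON = 3                          # number of steps before termination


def distributional_backup(next_dist, reward_dist=REWARD_DIST, gamma=GAMMA):
    """Law of R + gamma * Z', assuming R and Z' are independent."""
    out = defaultdict(float)
    for r, p_r in reward_dist.items():
        for z, p_z in next_dist.items():
            out[r + gamma * z] += p_r * p_z
    return dict(out)


def expectation(dist):
    """Mean of a finite distribution given as {value: probability}."""
    return sum(v * p for v, p in dist.items())


# Start from the terminal return (a point mass at 0) and back up step by step.
z = {0.0: 1.0}   # distribution of the random return Z
q = 0.0          # scalar value Q, updated with the ordinary Bellman recursion
mean_reward = expectation(REWARD_DIST)
for _ in range(HORIZON):
    z = distributional_backup(z)
    q = mean_reward + GAMMA * q

print(sorted(z.items()))
print(expectation(z), q)
```

The two printed numbers on the last line should agree, which is the sense in which Q is the expectation of Z. The distribution itself carries extra information, such as the spread of returns, that a risk-aware agent could use.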

Papers

Showing 1-50 of 137 papers

| Title | Status | Hype |
|---|---|---|
| Adaptive Risk-Tendency: Nano Drone Navigation in Cluttered Environments with Distributional Reinforcement Learning | Code | 1 |
| Implicit Distributional Reinforcement Learning | Code | 1 |
| Distributional Reinforcement Learning with Unconstrained Monotonic Neural Networks | Code | 1 |
| Unifying Cardiovascular Modelling with Deep Reinforcement Learning for Uncertainty Aware Control of Sepsis Treatment | Code | 1 |
| Gamma and Vega Hedging Using Deep Distributional Reinforcement Learning | Code | 1 |
| Trust Region-Based Safe Distributional Reinforcement Learning for Multiple Constraints | Code | 1 |
| A Distributional Analogue to the Successor Representation | Code | 1 |
| Distributional Reinforcement Learning via Moment Matching | Code | 1 |
| Risk-Sensitive Policy with Distributional Reinforcement Learning | Code | 1 |
| Conservative Offline Distributional Reinforcement Learning | Code | 1 |
| Intelligent Resource Allocation in Joint Radar-Communication With Graph Neural Networks | Code | 1 |
| ADDQ: Adaptive Distributional Double Q-Learning | Code | 0 |
| Conjugated Discrete Distributions for Distributional Reinforcement Learning | Code | 0 |
| Distributional Reinforcement Learning with Regularized Wasserstein Loss | Code | 0 |
| Value-Distributional Model-Based Reinforcement Learning | Code | 0 |
| Constrained Reinforcement Learning using Distributional Representation for Trustworthy Quadrotor UAV Tracking Control | Code | 0 |
| Variance Control for Distributional Reinforcement Learning | Code | 0 |
| Beyond CVaR: Leveraging Static Spectral Risk Measures for Enhanced Decision-Making in Distributional Reinforcement Learning | Code | 0 |
| Distributional Reinforcement Learning for Multi-Dimensional Reward Functions | Code | 0 |
| RIZE: Regularized Imitation Learning via Distributional Reinforcement Learning | Code | 0 |
| Implicit Quantile Networks for Distributional Reinforcement Learning | Code | 0 |
| QUOTA: The Quantile Option Architecture for Reinforcement Learning | Code | 0 |
| Distributional Reinforcement Learning with Quantile Regression | Code | 0 |
| Two steps to risk sensitivity | Code | 0 |
| GAN Q-learning | Code | 0 |
| A Cramér Distance perspective on Quantile Regression based Distributional Reinforcement Learning | Code | 0 |
| Information-Directed Exploration for Deep Reinforcement Learning | Code | 0 |
| CTD4 -- A Deep Continuous Distributional Actor-Critic Agent with a Kalman Fusion of Multiple Critics | Code | 0 |
| Tackling Uncertainties in Multi-Agent Reinforcement Learning through Integration of Agent Termination Dynamics | Code | 0 |
| EX-DRL: Hedging Against Heavy Losses with EXtreme Distributional Reinforcement Learning | Code | 0 |
| Distributional Model Equivalence for Risk-Sensitive Reinforcement Learning | Code | 0 |
| Exploring the Training Robustness of Distributional Reinforcement Learning against Noisy State Observations | Code | 0 |
| Distributional constrained reinforcement learning for supply chain optimization | Code | 0 |
| Distributional Bellman Operators over Mean Embeddings | Code | 0 |
| Echoes of Socratic Doubt: Embracing Uncertainty in Calibrated Evidential Reinforcement Learning | Code | 0 |
| Distributional Off-policy Evaluation with Bellman Residual Minimization | Code | 0 |
| Estimating Risk and Uncertainty in Deep Reinforcement Learning | Code | 0 |
| Estimation and Inference in Distributional Reinforcement Learning | Code | 0 |
| A Robust Quantile Huber Loss With Interpretable Parameter Adjustment In Distributional Reinforcement Learning | Code | 0 |
| Distributional Reinforcement Learning for Energy-Based Sequential Models | Code | 0 |
| IGN : Implicit Generative Networks | Code | 0 |
| Fully Parameterized Quantile Function for Distributional Reinforcement Learning | Code | 0 |
| The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning | Code | 0 |
| Deep Distributional Learning with Non-crossing Quantile Network | | 0 |
| CTRLS: Chain-of-Thought Reasoning via Latent State-Transition | | 0 |
| A Point-Based Algorithm for Distributional Reinforcement Learning in Partially Observable Domains | | 0 |
| Cramer Type Distances for Learning Gaussian Mixture Models by Gradient Descent | | 0 |
| An introduction to reinforcement learning for neuroscience | | 0 |
| Controlling Synthetic Characters in Simulations: A Case for Cognitive Architectures and Sigma | | 0 |
| Distributional Reinforcement Learning with Ensembles | | 0 |

No leaderboard results yet.