SOTAVerified

Distributional Reinforcement Learning

The value distribution is the distribution of the random return received by a reinforcement learning agent. It has been used for specific purposes such as implementing risk-aware behaviour.
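To make the risk-aware use concrete, here is a minimal sketch in Python. It assumes the return distribution of each action is available as a handful of samples, and it uses conditional value at risk (CVaR) as the risk measure; the sample values, the `cvar` helper, and the choice of CVaR are illustrative assumptions, not something taken from the papers listed here.

```python
import math


def cvar(returns, alpha):
    """Average of the worst alpha-fraction of sampled returns (lower tail)."""
    ordered = sorted(returns)
    # Keep at least one sample so the tail is never empty.
    k = max(1, math.ceil(alpha * len(ordered)))
    worst = ordered[:k]
    return sum(worst) / len(worst)


# Hypothetical return samples for two actions with the same expected return.
safe = [1.0, 1.0, 1.0, 1.0]
risky = [-2.0, 0.0, 2.0, 4.0]

mean_safe = sum(safe) / len(safe)
mean_risky = sum(risky) / len(risky)

# An agent that only sees the expectation cannot tell the actions apart,
# while a tail-based criterion prefers the action with the better worst case.
cvar_safe = cvar(safe, 0.25)
cvar_risky = cvar(risky, 0.25)
```

Both actions have the same mean, so a purely expectation-based agent is indifferent between them; the difference only shows up once the whole distribution is kept.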

We have a random return Z whose expectation is the value Q. This random return is also described by a recursive equation, but one of a distributional nature.
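For reference, the recursion is commonly written as the distributional Bellman equation. The notation below is an assumption, since the page itself only names Z and Q: x and a are a state and action, R is the random reward, γ is the discount factor, and (X', A') is the random next state-action pair.

```latex
Q(x, a) = \mathbb{E}\left[ Z(x, a) \right],
\qquad
Z(x, a) \overset{D}{=} R(x, a) + \gamma \, Z(X', A')
```

Here the symbol over the equals sign denotes equality in distribution: the two sides are random variables with the same law, rather than equal numbers as in the usual Bellman equation for Q.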

Papers

Showing 1-25 of 137 papers

Title | Status | Hype
Conservative Offline Distributional Reinforcement Learning | Code | 1
A Distributional Analogue to the Successor Representation | Code | 1
Intelligent Resource Allocation in Joint Radar-Communication With Graph Neural Networks | Code | 1
Risk-Sensitive Policy with Distributional Reinforcement Learning | Code | 1
Distributional Reinforcement Learning with Unconstrained Monotonic Neural Networks | Code | 1
Distributional Reinforcement Learning via Moment Matching | Code | 1
Trust Region-Based Safe Distributional Reinforcement Learning for Multiple Constraints | Code | 1
Implicit Distributional Reinforcement Learning | Code | 1
Unifying Cardiovascular Modelling with Deep Reinforcement Learning for Uncertainty Aware Control of Sepsis Treatment | Code | 1
Adaptive Risk-Tendency: Nano Drone Navigation in Cluttered Environments with Distributional Reinforcement Learning | Code | 1
Gamma and Vega Hedging Using Deep Distributional Reinforcement Learning | Code | 1
An introduction to reinforcement learning for neuroscience | | 0
Bellman Unbiasedness: Toward Provably Efficient Distributional Reinforcement Learning with General Value Function Approximation | | 0
A Point-Based Algorithm for Distributional Reinforcement Learning in Partially Observable Domains | | 0
An Analysis of Quantile Temporal-Difference Learning | | 0
An Analysis of Categorical Distributional Reinforcement Learning | | 0
Cramer Type Distances for Learning Gaussian Mixture Models by Gradient Descent | | 0
CTRLS: Chain-of-Thought Reasoning via Latent State-Transition | | 0
A Local Temporal Difference Code for Distributional Reinforcement Learning | | 0
Adaptive Nesterov Accelerated Distributional Deep Hedging for Efficient Volatility Risk Management | | 0
Beyond Average Return in Markov Decision Processes | | 0
Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds | | 0
Bellman Diffusion: Generative Modeling as Learning a Linear Operator in the Distribution Space | | 0
Conservative Distributional Reinforcement Learning with Safety Constraints | | 0
A Distributional Perspective on Actor-Critic Framework | | 0

No leaderboard results yet.