SOTAVerified

Distributional Reinforcement Learning

The value distribution is the distribution of the random return received by a reinforcement learning agent. It has been used for specific purposes such as implementing risk-aware behaviour.
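As a rough illustration of why the full return distribution matters for risk-aware behaviour, the sketch below compares two actions by their mean return and by a lower-tail risk measure (CVaR). The return samples, the `cvar` helper, and the choice of CVaR are assumptions made for this example and do not come from any paper listed here.

```python
import numpy as np

def cvar(returns, alpha):
    """Mean of the worst alpha-fraction of sampled returns (lower tail)."""
    sorted_returns = np.sort(np.asarray(returns, dtype=float))
    # Keep at least one sample in the tail.
    k = max(1, int(np.ceil(alpha * len(sorted_returns))))
    return float(sorted_returns[:k].mean())

# Hypothetical return samples for two actions (illustrative numbers only).
safe = np.array([0.9, 1.0, 1.1, 1.0])     # narrow spread
risky = np.array([-3.0, 0.5, 2.0, 5.0])   # slightly higher mean, heavy lower tail

# A risk-neutral agent compares expectations only.
mean_choice = "risky" if risky.mean() > safe.mean() else "safe"

# A risk-averse agent compares the lower tail of the distribution.
cvar_choice = "risky" if cvar(risky, 0.25) > cvar(safe, 0.25) else "safe"

print(mean_choice, cvar_choice)
```

The two criteria disagree here because the expectation discards the spread of the returns, which is exactly the information a value distribution keeps.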

Let Z denote the random return, whose expectation is the value Q. Like Q, this random return satisfies a recursive equation, but one of a distributional nature.
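The recursion mentioned above is usually written in the following form. The notation is an assumption, since the page does not fix any symbols beyond Z and Q: x and a denote a state and an action, R the random reward, γ the discount factor, P the transition kernel, and π the policy.

```latex
\begin{align}
  Q(x, a) &= \mathbb{E}\left[ Z(x, a) \right], \\
  Z(x, a) &\overset{D}{=} R(x, a) + \gamma \, Z(X', A'),
  \qquad X' \sim P(\cdot \mid x, a), \quad A' \sim \pi(\cdot \mid X').
\end{align}
```

The second line is an equality in distribution: both sides are random variables with the same law, not necessarily the same realised value. Taking expectations of both sides recovers the familiar Bellman equation for Q.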

Papers

Showing 11–20 of 137 papers

Title | Status | Hype
Implicit Distributional Reinforcement Learning | Code | 1
Distributional Reinforcement Learning on Path-dependent Options |  | 0
Second-Order Bounds for [0,1]-Valued Regression via Betting Loss |  | 0
CTRLS: Chain-of-Thought Reasoning via Latent State-Transition |  | 0
ADDQ: Adaptive Distributional Double Q-Learning | Code | 0
A Point-Based Algorithm for Distributional Reinforcement Learning in Partially Observable Domains |  | 0
Flow Models for Unbounded and Geometry-Aware Distributional Reinforcement Learning |  | 0
Deep Distributional Learning with Non-crossing Quantile Network |  | 0
Offline and Distributional Reinforcement Learning for Wireless Communications |  | 0
RIZE: Regularized Imitation Learning via Distributional Reinforcement Learning | Code | 0

No leaderboard results yet.