SOTAVerified

Offline RL

Papers

Showing 181–190 of 755 papers

| Title | Status | Hype |
|---|---|---|
| Stabilizing Extreme Q-learning by Maclaurin Expansion | Code | 0 |
| Strategically Conservative Q-Learning | Code | 1 |
| Self-Play with Adversarial Critic: Provable and Scalable Offline Alignment for Language Models | | 0 |
| UDQL: Bridging The Gap between MSE Loss and The Optimal Value Function in Offline Reinforcement Learning | | 0 |
| A Fast Convergence Theory for Offline Decision Making | | 0 |
| Causal prompting model-based offline reinforcement learning | | 0 |
| Diffusion Policies creating a Trust Region for Offline Reinforcement Learning | Code | 1 |
| Inverse Concave-Utility Reinforcement Learning is Inverse Game Theory | | 0 |
| Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning | | 0 |
| Reinforcement Learning in Dynamic Treatment Regimes Needs Critical Reexamination | Code | 1 |

Benchmark Results

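| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | KFC | Average Reward | 81.8 | | Unverified |
| 2 | ADMPO | Average Reward | 81 | | Unverified |
| 3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified |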