SOTAVerified

Offline RL

Papers

Showing 301-325 of 755 papers

Title | Status | Hype
Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism | | 0
A Conservative Q-Learning approach for handling distribution shift in sepsis treatment strategies | | 0
Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement Learning | | 0
ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning | | 0
Diffusion-Based Offline RL for Improved Decision-Making in Augmented ARC Task | | 0
Diffused Task-Agnostic Milestone Planner | | 0
A Primal-Dual Algorithm for Offline Constrained Reinforcement Learning with Linear MDPs | | 0
Accelerating Diffusion Models in Offline RL via Reward-Aware Consistency Trajectory Distillation | | 0
DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching | | 0
DiffPoGAN: Diffusion Policies with Generative Adversarial Networks for Offline Reinforcement Learning | | 0
BRAC+: Going Deeper with Behavior Regularized Offline Reinforcement Learning | | 0
Bootstrapped Transformer for Offline Reinforcement Learning | | 0
How to Provably Improve Return Conditioned Supervised Learning? | | 0
Boosting Offline Reinforcement Learning with Residual Generative Modeling | | 0
DIAR: Diffusion-model-guided Implicit Q-learning with Adaptive Revaluation | | 0
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning | | 0
Dialogue Evaluation with Offline Reinforcement Learning | | 0
Development and Validation of Heparin Dosing Policies Using an Offline Reinforcement Learning Algorithm | | 0
Boosting Offline Reinforcement Learning via Data Rebalancing | | 0
Addressing Extrapolation Error in Deep Offline Reinforcement Learning | | 0
Design from Policies: Conservative Test-Time Adaptation for Offline Policy Optimization | | 0
Learning Dexterous Manipulation from Suboptimal Experts | | 0
Boosting Offline Reinforcement Learning for Autonomous Driving with Hierarchical Latent Skills | | 0
LLQL: Logistic Likelihood Q-Learning for Reinforcement Learning | | 0
Launchpad: Learning to Schedule Using Offline and Online RL Methods | | 0
Page 13 of 31

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | | Unverified
2 | ADMPO | Average Reward | 81 | | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified