SOTAVerified

Offline RL

Papers

Showing 301–350 of 755 papers

Title | Status | Hype
A Conservative Q-Learning approach for handling distribution shift in sepsis treatment strategies | | 0
Near-Optimal Offline Reinforcement Learning via Double Variance Reduction | | 0
Near-Optimal Provable Uniform Convergence in Offline Policy Evaluation for Reinforcement Learning | | 0
Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement Learning | | 0
ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning | | 0
How to Provably Improve Return Conditioned Supervised Learning? | | 0
Diffusion-Based Offline RL for Improved Decision-Making in Augmented ARC Task | | 0
Diffused Task-Agnostic Milestone Planner | | 0
A Primal-Dual Algorithm for Offline Constrained Reinforcement Learning with Linear MDPs | | 0
Multi-Object Navigation in real environments using hybrid policies | | 0
DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching | | 0
BRAC+: Going Deeper with Behavior Regularized Offline Reinforcement Learning | | 0
Multi-Objective Decision Transformers for Offline Reinforcement Learning | | 0
Leveraging Offline Data in Online Reinforcement Learning | | 0
Leveraging Domain-Unlabeled Data in Offline Reinforcement Learning across Two Domains | | 0
Bootstrapped Transformer for Offline Reinforcement Learning | | 0
Multi-Objective-Optimization Multi-AUV Assisted Data Collection Framework for IoUT Based on Offline Reinforcement Learning | | 0
Navigation with QPHIL: Quantizing Planner for Hierarchical Implicit Q-Learning | | 0
Addressing Extrapolation Error in Deep Offline Reinforcement Learning | | 0
Learning to View: Decision Transformers for Active Object Detection | | 0
DIAR: Diffusion-model-guided Implicit Q-learning with Adaptive Revaluation | | 0
Dialogue Evaluation with Offline Reinforcement Learning | | 0
Learning to Influence Human Behavior with Offline Reinforcement Learning | | 0
Development and Validation of Heparin Dosing Policies Using an Offline Reinforcement Learning Algorithm | | 0
Boosting Offline Reinforcement Learning via Data Rebalancing | | 0
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning | | 0
MOReL: Model-Based Offline Reinforcement Learning | | 0
Learning to Clarify by Reinforcement Learning Through Reward-Weighted Fine-Tuning | | 0
Learning Pseudometric-based Action Representations for Offline Reinforcement Learning | | 0
Accelerating Diffusion Models in Offline RL via Reward-Aware Consistency Trajectory Distillation | | 0
Design from Policies: Conservative Test-Time Adaptation for Offline Policy Optimization | | 0
Learning Value Functions from Undirected State-only Experience | | 0
Boosting Offline Reinforcement Learning with Residual Generative Modeling | | 0
Learning Dexterous Manipulation from Suboptimal Experts | | 0
Boosting Offline Reinforcement Learning for Autonomous Driving with Hierarchical Latent Skills | | 0
MoMA: Model-based Mirror Ascent for Offline Reinforcement Learning | | 0
Launchpad: Learning to Schedule Using Offline and Online RL Methods | | 0
Deploying Offline Reinforcement Learning with Human Feedback | | 0
Leveraging Optimal Transport for Enhanced Offline Reinforcement Learning in Surgical Robotic Environments | | 0
DiffPoGAN: Diffusion Policies with Generative Adversarial Networks for Offline Reinforcement Learning | | 0
LLM-Based Offline Learning for Embodied Agents via Consistency-Guided Reward Ensemble | | 0
LLQL: Logistic Likelihood Q-Learning for Reinforcement Learning | | 0
Language Decision Transformers with Exponential Tilt for Interactive Text Environments | | 0
Large-Scale Retrieval for Reinforcement Learning | | 0
Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding | | 0
Bi-Level Offline Policy Optimization with Limited Exploration | | 0
MOORL: A Framework for Integrating Offline-Online Reinforcement Learning | | 0
MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning | | 0
Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game | | 0
Large Language Model driven Policy Exploration for Recommender Systems | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | | Unverified
2 | ADMPO | Average Reward | 81 | | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified