SOTAVerified

Offline RL

Papers

Showing 201-250 of 755 papers

Title | Status | Hype
Hyperparameter Selection for Offline Reinforcement Learning | | 0
Contrastive Learning as Goal-Conditioned Reinforcement Learning | | 0
Flow-Based Single-Step Completion for Efficient and Expressive Policy Learning | | 0
Flexible Blood Glucose Control: Offline Reinforcement Learning from Human Feedback | | 0
BECAUSE: Bilinear Causal Representation for Generalizable Offline Model-based Reinforcement Learning | | 0
Contrastive Diffuser: Planning Towards High Return States via Contrastive Learning | | 0
BCRLSP: An Offline Reinforcement Learning Framework for Sequential Targeted Promotion | | 0
Bayesian Reparameterization of Reward-Conditioned Reinforcement Learning with Energy-based Models | | 0
Contextual Transformer for Offline Meta Reinforcement Learning | | 0
Feasibility-Aware Pessimistic Estimation: Toward Long-Horizon Safety in Offline RL | | 0
Context-Former: Stitching via Latent Conditioned Sequence Modeling | | 0
AdaCred: Adaptive Causal Decision Transformers with Feature Crediting | | 0
Align Your Intents: Offline Imitation Learning via Optimal Transport | | 0
Exploring the Potential of Offline RL for Reasoning in LLMs: A Preliminary Study | | 0
Hundreds Guide Millions: Adaptive Offline Reinforcement Learning with Expert Guidance | | 0
Exploiting Generalization in Offline Reinforcement Learning via Unseen State Augmentations | | 0
Achieving Fairness in Multi-Agent Markov Decision Processes Using Reinforcement Learning | | 0
How to Spend Your Robot Time: Bridging Kickstarting and Offline Reinforcement Learning for Vision-based Robotic Manipulation | | 0
Unified Preference Optimization: Language Model Alignment Beyond the Preference Frontier | | 0
Evaluation-Time Policy Switching for Offline Reinforcement Learning | | 0
Evaluation of Active Feature Acquisition Methods for Static Feature Settings | | 0
Equivariant Offline Reinforcement Learning | | 0
Equivariant Data Augmentation for Generalization in Offline Reinforcement Learning | | 0
Environment Transformer and Policy Optimization for Model-Based Offline Reinforcement Learning | | 0
Ensemble Successor Representations for Task Generalization in Offline-to-Online Reinforcement Learning | | 0
Conservative Data Sharing for Multi-Task Offline Reinforcement Learning | | 0
Batch-Constrained Distributional Reinforcement Learning for Session-based Recommendation | | 0
Hybrid Reinforcement Learning Breaks Sample Size Barriers in Linear MDPs | | 0
ENOTO: Improving Offline-to-Online Reinforcement Learning with Q-Ensembles | | 0
Exclusively Penalized Q-learning for Offline Reinforcement Learning | | 0
Confidence-Conditioned Value Functions for Offline Reinforcement Learning | | 0
Enhancing Reinforcement Learning Through Guided Search | | 0
Enhancing Pre-Trained Decision Transformers with Prompt-Tuning Bandits | | 0
A Validation Tool for Designing Reinforcement Learning Environments | | 0
Accelerating Offline Reinforcement Learning Application in Real-Time Bidding and Recommendation: Potential Use of Simulation | | 0
A Tractable Inference Perspective of Offline RL | | 0
Enhancing Offline Model-Based RL via Active Model Selection: A Bayesian Optimization Perspective | | 0
Constraints Penalized Q-learning for Safe Offline Reinforcement Learning | | 0
Enhancing Cross-domain Pre-Trained Decision Transformers with Adaptive Attention | | 0
Enhanced DACER Algorithm with High Diffusion Efficiency | | 0
Federated Offline Reinforcement Learning | | 0
Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices | | 0
Fighting Uncertainty with Gradients: Offline Reinforcement Learning via Diffusion Score Matching | | 0
Finer Behavioral Foundation Models via Auto-Regressive Features and Advantage Weighting | | 0
Finetuning from Offline Reinforcement Learning: Challenges, Trade-offs and Practical Solutions | | 0
Finetuning Offline World Models in the Real World | | 0
Energy-Weighted Flow Matching for Offline Reinforcement Learning | | 0
Automatic Trade-off Adaptation in Offline RL | | 0
H-GAP: Humanoid Control with a Generalist Planner | | 0
End-to-end Offline Reinforcement Learning for Glycemia Control | | 0
Page 5 of 16

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | | Unverified
2 | ADMPO | Average Reward | 81 | | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified