SOTAVerified

Offline RL

Papers

Showing 401–450 of 755 papers

| Title | Status | Hype |
| --- | --- | --- |
| Enhancing Reinforcement Learning Through Guided Search | | 0 |
| ENOTO: Improving Offline-to-Online Reinforcement Learning with Q-Ensembles | | 0 |
| Ensemble Successor Representations for Task Generalization in Offline-to-Online Reinforcement Learning | | 0 |
| Environment Transformer and Policy Optimization for Model-Based Offline Reinforcement Learning | | 0 |
| Equivariant Data Augmentation for Generalization in Offline Reinforcement Learning | | 0 |
| Equivariant Offline Reinforcement Learning | | 0 |
| Evaluation of Active Feature Acquisition Methods for Static Feature Settings | | 0 |
| Evaluation-Time Policy Switching for Offline Reinforcement Learning | | 0 |
| Exclusively Penalized Q-learning for Offline Reinforcement Learning | | 0 |
| Exploiting Generalization in Offline Reinforcement Learning via Unseen State Augmentations | | 0 |
| Exploring the Potential of Offline RL for Reasoning in LLMs: A Preliminary Study | | 0 |
| A Tractable Inference Perspective of Offline RL | | 0 |
| Feasibility-Aware Pessimistic Estimation: Toward Long-Horizon Safety in Offline RL | | 0 |
| Federated Offline Reinforcement Learning | | 0 |
| Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices | | 0 |
| Fighting Uncertainty with Gradients: Offline Reinforcement Learning via Diffusion Score Matching | | 0 |
| Finer Behavioral Foundation Models via Auto-Regressive Features and Advantage Weighting | | 0 |
| Finetuning from Offline Reinforcement Learning: Challenges, Trade-offs and Practical Solutions | | 0 |
| Finetuning Offline World Models in the Real World | | 0 |
| Flexible Blood Glucose Control: Offline Reinforcement Learning from Human Feedback | | 0 |
| Flow-Based Single-Step Completion for Efficient and Expressive Policy Learning | | 0 |
| FOSP: Fine-tuning Offline Safe Policy through World Models | | 0 |
| From Novelty to Imitation: Self-Distilled Rewards for Offline Reinforcement Learning | | 0 |
| Uncertainty Estimation Using Riemannian Model Dynamics for Offline Reinforcement Learning | | 0 |
| Generalize by Touching: Tactile Ensemble Skill Transfer for Robotic Furniture Assembly | | 0 |
| Generative Probabilistic Planning for Optimizing Supply Chain Networks | | 0 |
| GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning | | 0 |
| Goal-Conditioned Data Augmentation for Offline Reinforcement Learning | | 0 |
| Learning Goal-Conditioned Policies from Sub-Optimal Offline Data via Metric Learning | | 0 |
| Goal-Conditioned Predictive Coding for Offline Reinforcement Learning | | 0 |
| Graph Decision Transformer | | 0 |
| GriddlyJS: A Web IDE for Reinforcement Learning | | 0 |
| Guided Data Augmentation for Offline Reinforcement Learning and Imitation Learning | | 0 |
| H2O+: An Improved Framework for Hybrid Offline-and-Online RL with Dynamics Gaps | | 0 |
| Harnessing Density Ratios for Online Reinforcement Learning | | 0 |
| H-GAP: Humanoid Control with a Generalist Planner | | 0 |
| How to Leverage Unlabeled Data in Offline Reinforcement Learning | | 0 |
| How to Spend Your Robot Time: Bridging Kickstarting and Offline Reinforcement Learning for Vision-based Robotic Manipulation | | 0 |
| Hundreds Guide Millions: Adaptive Offline Reinforcement Learning with Expert Guidance | | 0 |
| Unified Preference Optimization: Language Model Alignment Beyond the Preference Frontier | | 0 |
| Hybrid Reinforcement Learning Breaks Sample Size Barriers in Linear MDPs | | 0 |
| Hyperparameter Selection for Offline Reinforcement Learning | | 0 |
| Implicit Offline Reinforcement Learning via Supervised Learning | | 0 |
| Importance of Empirical Sample Complexity Analysis for Offline Reinforcement Learning | | 0 |
| Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback | | 0 |
| Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Reinforcement Learning | | 0 |
| Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage Policy Optimization | | 0 |
| Improving Offline Reinforcement Learning with Inaccurate Simulators | | 0 |
| Improving Offline RL by Blending Heuristics | | 0 |
| Improving Zero-shot Generalization in Offline Reinforcement Learning using Generalized Similarity Functions | | 0 |

Benchmark Results

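| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | KFC | Average Reward | 81.8 | | Unverified |
| 2 | ADMPO | Average Reward | 81 | | Unverified |
| 3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified |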