SOTAVerified

Offline RL

Papers

Showing 301–350 of 755 papers

| Title | Status | Hype |
| --- | --- | --- |
| Guided Data Augmentation for Offline Reinforcement Learning and Imitation Learning | | 0 |
| CROP: Conservative Reward for Model-based Offline Policy Optimization | Code | 1 |
| Model-enhanced Contrastive Reinforcement Learning for Sequential Recommendation | | 0 |
| Finetuning Offline World Models in the Real World | | 0 |
| Corruption-Robust Offline Reinforcement Learning with General Function Approximation | Code | 0 |
| Towards Robust Offline Reinforcement Learning under Diverse Data Corruption | Code | 1 |
| Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning | | 0 |
| Building Persona Consistent Dialogue Agents with Offline Reinforcement Learning | Code | 0 |
| End-to-end Offline Reinforcement Learning for Glycemia Control | | 0 |
| Leveraging Optimal Transport for Enhanced Offline Reinforcement Learning in Surgical Robotic Environments | | 0 |
| Offline Retraining for Online RL: Decoupled Policy Learning to Mitigate Exploration Bias | Code | 1 |
| Bi-Level Offline Policy Optimization with Limited Exploration | | 0 |
| DiffCPS: Diffusion Model based Constrained Policy Search for Offline Reinforcement Learning | Code | 0 |
| Planning to Go Out-of-Distribution in Offline-to-Online Reinforcement Learning | | 0 |
| Improving Offline-to-Online Reinforcement Learning with Q Conditioned State Entropy Exploration | | 0 |
| Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL | Code | 1 |
| Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets | Code | 1 |
| Self-Confirming Transformer for Belief-Conditioned Adaptation in Offline Multi-Agent Reinforcement Learning | | 0 |
| Learning to Reach Goals via Diffusion | Code | 0 |
| Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning | | 0 |
| Consistency Models as a Rich and Efficient Policy Class for Reinforcement Learning | Code | 1 |
| Towards Robust Offline-to-Online Reinforcement Learning via Uncertainty and Smoothness | Code | 0 |
| Uncertainty-Aware Decision Transformer for Stochastic Driving Environments | | 0 |
| Zero-Shot Reinforcement Learning from Low Quality Data | Code | 1 |
| Boosting Offline Reinforcement Learning for Autonomous Driving with Hierarchical Latent Skills | | 0 |
| Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning | Code | 1 |
| H2O+: An Improved Framework for Hybrid Offline-and-Online RL with Dynamics Gaps | | 0 |
| Robotic Offline RL from Internet Videos via Value-Function Pre-Training | | 0 |
| Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions | | 0 |
| DOMAIN: MilDly COnservative Model-BAsed OfflINe Reinforcement Learning | | 0 |
| Equivariant Data Augmentation for Generalization in Offline Reinforcement Learning | | 0 |
| VAPOR: Legged Robot Navigation in Outdoor Vegetation Using Offline Reinforcement Learning | Code | 1 |
| Reasoning with Latent Diffusion in Offline Reinforcement Learning | Code | 1 |
| ORL-AUDITOR: Dataset Auditing in Offline Deep Reinforcement Learning | Code | 1 |
| Model-based Offline Policy Optimization with Adversarial Network | Code | 0 |
| Hundreds Guide Millions: Adaptive Offline Reinforcement Learning with Expert Guidance | | 0 |
| Multi-Objective Decision Transformers for Offline Reinforcement Learning | | 0 |
| Reinforced Self-Training (ReST) for Language Modeling | | 0 |
| Real Robot Challenge 2022: Learning Dexterous Manipulation from Offline Data in the Real World | | 0 |
| Exploiting Generalization in Offline Reinforcement Learning via Unseen State Augmentations | | 0 |
| AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning | Code | 2 |
| Integrating Offline Reinforcement Learning with Transformers for Sequential Recommendation | | 0 |
| Contrastive Example-Based Control | Code | 0 |
| A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning | Code | 0 |
| On the Effectiveness of Offline RL for Dialogue Response Generation | Code | 0 |
| Model-based Offline Reinforcement Learning with Count-based Conservatism | Code | 0 |
| PASTA: Pretrained Action-State Transformer Agents | | 0 |
| Towards Self-Assembling Artificial Neural Networks through Neural Developmental Programs | Code | 1 |
| Robotic Manipulation Datasets for Offline Compositional Reinforcement Learning | Code | 1 |
| Budgeting Counterfactual for Offline RL | | 0 |

Benchmark Results

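| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | KFC | Average Reward | 81.8 | | Unverified |
| 2 | ADMPO | Average Reward | 81 | | Unverified |
| 3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified |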