SOTAVerified

Offline RL

Papers

Showing 51-100 of 755 papers

| Title | Status | Hype |
| --- | --- | --- |
| In-Dataset Trajectory Return Regularization for Offline Preference-based Reinforcement Learning | Code | 1 |
| IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies | Code | 1 |
| Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information | Code | 1 |
| Dual RL: Unification and New Methods for Reinforcement and Imitation Learning | Code | 1 |
| A Workflow for Offline Model-Free Robotic Reinforcement Learning | Code | 1 |
| AdaCat: Adaptive Categorical Discretization for Autoregressive Models | Code | 1 |
| Batch Exploration with Examples for Scalable Robotic Reinforcement Learning | Code | 1 |
| Masked Autoencoding for Scalable and Generalizable Decision Making | Code | 1 |
| Alleviating Matthew Effect of Offline Reinforcement Learning in Interactive Recommendation | Code | 1 |
| Leveraging Demonstrations with Latent Space Priors | Code | 1 |
| All You Need Is Supervised Learning: From Imitation Learning to Meta-RL With Upside Down RL | Code | 1 |
| Entropy-regularized Diffusion Policy with Q-Ensembles for Offline Reinforcement Learning | Code | 1 |
| Efficient Reinforcement Learning Through Trajectory Generation | Code | 1 |
| Optimistic Curiosity Exploration and Conservative Exploitation with Linear Reward Shaping | Code | 1 |
| Efficient Offline Policy Optimization with a Learned Model | Code | 1 |
| Making Offline RL Online: Collaborative World Models for Offline Visual Reinforcement Learning | Code | 1 |
| Are Expressive Models Truly Necessary for Offline RL? | Code | 1 |
| Extreme Q-Learning: MaxEnt RL without Entropy | Code | 1 |
| DMC-VB: A Benchmark for Representation Learning for Control with Visual Distractors | Code | 1 |
| Direct Preference-based Policy Optimization without Reward Modeling | Code | 1 |
| Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning | Code | 1 |
| Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization | Code | 1 |
| Diffusion Policies creating a Trust Region for Offline Reinforcement Learning | Code | 1 |
| Doubly Mild Generalization for Offline Reinforcement Learning | Code | 1 |
| Federated Ensemble-Directed Offline Reinforcement Learning | Code | 1 |
| Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets | Code | 1 |
| An Optimistic Perspective on Offline Deep Reinforcement Learning | Code | 1 |
| DataLight: Offline Data-Driven Traffic Signal Control | Code | 1 |
| Beyond Pick-and-Place: Tackling Robotic Stacking of Diverse Shapes | Code | 1 |
| Beyond OOD State Actions: Supported Cross-Domain Offline Reinforcement Learning | Code | 1 |
| Curriculum Offline Imitation Learning | Code | 1 |
| Critic-Guided Decision Transformer for Offline Reinforcement Learning | Code | 1 |
| A Policy-Guided Imitation Approach for Offline Reinforcement Learning | Code | 1 |
| Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning | Code | 1 |
| Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations | Code | 1 |
| When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning | Code | 1 |
| Critic Regularized Regression | Code | 1 |
| COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation | Code | 1 |
| Efficient Diffusion Policies for Offline Reinforcement Learning | Code | 1 |
| FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization | Code | 1 |
| Efficient Planning in a Compact Latent Action Space | Code | 1 |
| Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning | Code | 1 |
| Acme: A Research Framework for Distributed Reinforcement Learning | Code | 1 |
| cosFormer: Rethinking Softmax in Attention | Code | 1 |
| Can Wikipedia Help Offline Reinforcement Learning? | Code | 1 |
| Adversarially Trained Actor Critic for Offline Reinforcement Learning | Code | 1 |
| CROP: Conservative Reward for Model-based Offline Policy Optimization | Code | 1 |
| Decision Transformer: Reinforcement Learning via Sequence Modeling | Code | 1 |
| Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning | Code | 1 |
| A Minimalist Approach to Offline Reinforcement Learning | Code | 1 |
Page 2 of 16

Benchmark Results

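| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | KFC | Average Reward | 81.8 | | Unverified |
| 2 | ADMPO | Average Reward | 81 | | Unverified |
| 3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified |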