SOTAVerified

Offline RL

Papers

Showing 51–100 of 755 papers

Title | Status | Hype
IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies | Code | 1
ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts | Code | 1
Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information | Code | 1
Improving and Benchmarking Offline Reinforcement Learning Algorithms | Code | 1
A Workflow for Offline Model-Free Robotic Reinforcement Learning | Code | 1
AdaCat: Adaptive Categorical Discretization for Autoregressive Models | Code | 1
Batch Exploration with Examples for Scalable Robotic Reinforcement Learning | Code | 1
Reliable Conditioning of Behavioral Cloning for Offline Reinforcement Learning | Code | 1
Alleviating Matthew Effect of Offline Reinforcement Learning in Interactive Recommendation | Code | 1
Leveraging Demonstrations with Latent Space Priors | Code | 1
All You Need Is Supervised Learning: From Imitation Learning to Meta-RL With Upside Down RL | Code | 1
GNN-DT: Graph Neural Network Enhanced Decision Transformer for Efficient Optimization in Dynamic Environments | Code | 1
Masked Autoencoding for Scalable and Generalizable Decision Making | Code | 1
Federated Ensemble-Directed Offline Reinforcement Learning | Code | 1
Entropy-regularized Diffusion Policy with Q-Ensembles for Offline Reinforcement Learning | Code | 1
Optimistic Curiosity Exploration and Conservative Exploitation with Linear Reward Shaping | Code | 1
Efficient Reinforcement Learning Through Trajectory Generation | Code | 1
COMBO: Conservative Offline Model-Based Policy Optimization | Code | 1
Are Expressive Models Truly Necessary for Offline RL? | Code | 1
Efficient Offline Policy Optimization with a Learned Model | Code | 1
Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning | Code | 1
DMC-VB: A Benchmark for Representation Learning for Control with Visual Distractors | Code | 1
Efficient Diffusion Policies for Offline Reinforcement Learning | Code | 1
Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations | Code | 1
Diffusion Policies creating a Trust Region for Offline Reinforcement Learning | Code | 1
When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning | Code | 1
FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization | Code | 1
Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning | Code | 1
Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets | Code | 1
An Optimistic Perspective on Offline Deep Reinforcement Learning | Code | 1
Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization | Code | 1
Direct Preference-based Policy Optimization without Reward Modeling | Code | 1
A Policy-Guided Imitation Approach for Offline Reinforcement Learning | Code | 1
Making Offline RL Online: Collaborative World Models for Offline Visual Reinforcement Learning | Code | 1
Beyond Pick-and-Place: Tackling Robotic Stacking of Diverse Shapes | Code | 1
Doubly Mild Generalization for Offline Reinforcement Learning | Code | 1
Beyond OOD State Actions: Supported Cross-Domain Offline Reinforcement Learning | Code | 1
DataLight: Offline Data-Driven Traffic Signal Control | Code | 1
Curriculum Offline Imitation Learning | Code | 1
Efficient Planning in a Compact Latent Action Space | Code | 1
CIRS: Bursting Filter Bubbles by Counterfactual Interactive Recommender System | Code | 1
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning | Code | 1
Acme: A Research Framework for Distributed Reinforcement Learning | Code | 1
Extreme Q-Learning: MaxEnt RL without Entropy | Code | 1
CROP: Conservative Reward for Model-based Offline Policy Optimization | Code | 1
Decision Transformer: Reinforcement Learning via Sequence Modeling | Code | 1
Conservative Q-Learning for Offline Reinforcement Learning | Code | 1
Consistency Models as a Rich and Efficient Policy Class for Reinforcement Learning | Code | 1
Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning | Code | 1
A Minimalist Approach to Offline Reinforcement Learning | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | | Unverified
2 | ADMPO | Average Reward | 81 | | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified