SOTAVerified

Offline RL

Papers

Showing 151–200 of 755 papers

Title | Status | Hype
Entropy-regularized Diffusion Policy with Q-Ensembles for Offline Reinforcement Learning | Code | 1
Decision Transformer: Reinforcement Learning via Sequence Modeling | Code | 1
Beyond Pick-and-Place: Tackling Robotic Stacking of Diverse Shapes | Code | 1
GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning | Code | 1
Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning | Code | 1
CROP: Conservative Reward for Model-based Offline Policy Optimization | Code | 1
Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets | Code | 1
Guiding Online Reinforcement Learning with Action-Free Offline Pretraining | Code | 1
Critic Regularized Regression | Code | 1
Dual RL: Unification and New Methods for Reinforcement and Imitation Learning | Code | 1
Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization | Code | 1
Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models | Code | 1
Latent-Variable Advantage-Weighted Policy Optimization for Offline RL | Code | 1
MOReL: Model-Based Offline Reinforcement Learning | Code | 1
Behavior Proximal Policy Optimization | Code | 1
Critic-Guided Decision Transformer for Offline Reinforcement Learning | Code | 1
Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning | Code | 1
OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation | Code | 1
cosFormer: Rethinking Softmax in Attention | Code | 1
Reinformer: Max-Return Sequence Modeling for Offline RL | Code | 1
When should we prefer Decision Transformers for Offline Reinforcement Learning? | Code | 1
Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows | Code | 1
Efficient Diffusion Policies for Offline Reinforcement Learning | Code | 1
Leveraging Factored Action Spaces for Efficient Offline Reinforcement Learning in Healthcare | Code | 1
COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation | Code | 1
Are Expressive Models Truly Necessary for Offline RL? | Code | 1
ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts | Code | 1
Diffusion Policies creating a Trust Region for Offline Reinforcement Learning | Code | 1
Masked Trajectory Models for Prediction, Representation, and Control | Code | 1
Masked Autoencoding for Scalable and Generalizable Decision Making | Code | 1
Optimal Transport for Offline Imitation Learning | Code | 1
Pessimistic Value Iteration for Multi-Task Data Sharing in Offline Reinforcement Learning | Code | 1
BAFFLE: Hiding Backdoors in Offline Reinforcement Learning Datasets | Code | 1
Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations | Code | 1
When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning | Code | 1
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning | Code | 1
Q-value Regularized Transformer for Offline Reinforcement Learning | Code | 1
Model Selection for Offline Reinforcement Learning: Practical Considerations for Healthcare Settings | Code | 1
DMC-VB: A Benchmark for Representation Learning for Control with Visual Distractors | Code | 1
Model-Bellman Inconsistency for Model-based Offline Reinforcement Learning | Code | 1
An Optimistic Perspective on Offline Reinforcement Learning | Code | 1
NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning | Code | 1
Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning | Code | 1
Adversarially Trained Actor Critic for Offline Reinforcement Learning | Code | 1
Supported Policy Optimization for Offline Reinforcement Learning | Code | 1
Doubly Mild Generalization for Offline Reinforcement Learning | Code | 1
Neural Laplace Control for Continuous-time Delayed Systems | Code | 1
ODICE: Revealing the Mystery of Distribution Correction Estimation via Orthogonal-gradient Update | Code | 1
COPA: Certifying Robust Policies for Offline Reinforcement Learning against Poisoning Attacks | Code | 0
Off-policy Evaluation in Doubly Inhomogeneous Environments | Code | 0
Page 4 of 16

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | | Unverified
2 | ADMPO | Average Reward | 81 | | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified