| Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning | Oct 25, 2022 | D4RL, Offline RL | Code Available | 1 |
| Decision Transformer: Reinforcement Learning via Sequence Modeling | Jun 2, 2021 | Atari Games, D4RL | Code Available | 1 |
| Beyond Pick-and-Place: Tackling Robotic Stacking of Diverse Shapes | Oct 12, 2021 | Offline RL, Reinforcement Learning (RL) | Code Available | 1 |
| Leveraging Demonstrations with Latent Space Priors | Oct 26, 2022 | Offline RL | Code Available | 1 |
| Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search | May 24, 2024 | Code Generation, Language Modelling | Code Available | 1 |
| An Optimistic Perspective on Offline Deep Reinforcement Learning | Jan 1, 2020 | Atari Games, Deep Reinforcement Learning | Code Available | 1 |
| Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets | Oct 6, 2023 | D4RL, Decision Making | Code Available | 1 |
| LTLDoG: Satisfying Temporally-Extended Symbolic Constraints for Safe Diffusion-based Planning | May 7, 2024 | Offline RL, Robot Manipulation | Code Available | 1 |
| Harnessing Mixed Offline Reinforcement Learning Datasets via Trajectory Weighting | Jun 22, 2023 | Offline RL, reinforcement-learning | Code Available | 1 |
| Model-Bellman Inconsistency for Model-based Offline Reinforcement Learning | Jul 1, 2023 | D4RL, model | Code Available | 1 |
| Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization | Jun 5, 2020 | Offline RL, reinforcement-learning | Code Available | 1 |
| MOPO: Model-based Offline Policy Optimization | May 27, 2020 | continuous-control, Continuous Control | Code Available | 1 |
| NeoRL-2: Near Real-World Benchmarks for Offline Reinforcement Learning with Extended Realistic Scenarios | Mar 25, 2025 | Benchmarking, Offline RL | Code Available | 1 |
| Neural Laplace Control for Continuous-time Delayed Systems | Feb 24, 2023 | Model Predictive Control, Offline RL | Code Available | 1 |
| Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL | May 28, 2024 | Offline RL, Reinforcement Learning (RL) | Code Available | 1 |
| Offline Meta-Reinforcement Learning with Advantage Weighting | Aug 13, 2020 | Machine Translation, Meta-Learning | Code Available | 1 |
| CROP: Conservative Reward for Model-based Offline Policy Optimization | Oct 26, 2023 | D4RL, Offline RL | Code Available | 1 |
| Critic Regularized Regression | Jun 26, 2020 | Offline RL, regression | Code Available | 1 |
| Offline Reinforcement Learning for Safer Blood Glucose Control in People with Type 1 Diabetes | Apr 7, 2022 | Offline RL, Reinforcement Learning (RL) | Code Available | 1 |
| Offline Reinforcement Learning for Visual Navigation | Dec 16, 2022 | Navigate, Offline RL | Code Available | 1 |
| Offline Reinforcement Learning with Implicit Q-Learning | Oct 12, 2021 | D4RL, Offline RL | Code Available | 1 |
| Offline Reinforcement Learning with In-sample Q-Learning | Sep 29, 2021 | D4RL, Offline RL | Code Available | 1 |
| Behavior Proximal Policy Optimization | Feb 22, 2023 | D4RL, Offline RL | Code Available | 1 |
| Offline Reinforcement Learning with Reverse Model-based Imagination | Oct 1, 2021 | Data Augmentation, model | Code Available | 1 |
| Critic-Guided Decision Transformer for Offline Reinforcement Learning | Dec 21, 2023 | D4RL, Offline RL | Code Available | 1 |
| Are Expressive Models Truly Necessary for Offline RL? | Dec 15, 2024 | D4RL, Offline RL | Code Available | 1 |
| Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning | Sep 22, 2023 | counterfactual, Multi-agent Reinforcement Learning | Code Available | 1 |
| Diffusion Policies creating a Trust Region for Offline Reinforcement Learning | May 30, 2024 | D4RL, Denoising | Code Available | 1 |
| Online and Offline Reinforcement Learning by Planning with a Learned Model | Apr 13, 2021 | Atari Games, Continuous Control | Code Available | 1 |
| Online reinforcement learning with sparse rewards through an active inference capsule | Jun 4, 2021 | Offline RL, reinforcement-learning | Code Available | 1 |
| Federated Ensemble-Directed Offline Reinforcement Learning | May 4, 2023 | continuous-control, Continuous Control | Code Available | 1 |
| Direct Preference-based Policy Optimization without Reward Modeling | Jan 30, 2023 | Contrastive Learning, Offline RL | Code Available | 1 |
| PDiT: Interleaving Perception and Decision-making Transformers for Deep Reinforcement Learning | Dec 26, 2023 | Decision Making, Deep Reinforcement Learning | Code Available | 1 |
| Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations | Jul 20, 2022 | Imitation Learning, Offline RL | Code Available | 1 |
| When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning | May 23, 2022 | D4RL, Offline RL | Code Available | 1 |
| Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning | Mar 9, 2023 | Offline RL, Q-Learning | Code Available | 1 |
| cosFormer: Rethinking Softmax in Attention | Feb 17, 2022 | D4RL, Language Modeling | Code Available | 1 |
| PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer | Jun 10, 2024 | continuous-control, Continuous Control | Code Available | 1 |
| When should we prefer Decision Transformers for Offline Reinforcement Learning? | May 23, 2023 | D4RL, Imitation Learning | Code Available | 1 |
| Policy Regularization with Dataset Constraint for Offline Reinforcement Learning | Jun 11, 2023 | Offline RL, reinforcement-learning | Code Available | 1 |
| Optimistic Curiosity Exploration and Conservative Exploitation with Linear Reward Shaping | Sep 15, 2022 | continuous-control, Continuous Control | Code Available | 1 |
| COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation | Apr 19, 2022 | Offline RL, Off-policy evaluation | Code Available | 1 |
| Extreme Q-Learning: MaxEnt RL without Entropy | Jan 5, 2023 | D4RL, Deep Reinforcement Learning | Code Available | 1 |
| Adversarially Trained Actor Critic for Offline Reinforcement Learning | Feb 5, 2022 | continuous-control, Continuous Control | Code Available | 1 |
| MoCoDA: Model-based Counterfactual Data Augmentation | Oct 20, 2022 | counterfactual, Data Augmentation | Code Available | 1 |
| Doubly Mild Generalization for Offline Reinforcement Learning | Nov 12, 2024 | MuJoCo, Offline RL | Code Available | 1 |
| Reinformer: Max-Return Sequence Modeling for Offline RL | May 14, 2024 | D4RL, Offline RL | Code Available | 1 |
| RAMBO-RL: Robust Adversarial Model-Based Offline Reinforcement Learning | Apr 26, 2022 | Offline RL, reinforcement-learning | Code Available | 1 |
| Behavioral Entropy-Guided Dataset Generation for Offline Reinforcement Learning | Feb 6, 2025 | Dataset Generation, MuJoCo | Unverified | 0 |
| Contrastive Value Learning: Implicit Models for Simple Offline RL | Nov 3, 2022 | continuous-control, Continuous Control | Unverified | 0 |