| Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning | Jan 31, 2022 | Diversity, Offline RL | Code Available | 1 |
| Can Wikipedia Help Offline Reinforcement Learning? | Jan 28, 2022 | Offline RL, reinforcement-learning | Code Available | 1 |
| RvS: What is Essential for Offline RL via Supervised Learning? | Dec 20, 2021 | Offline RL | Code Available | 1 |
| Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks | Dec 6, 2021 | All, Multi-agent Reinforcement Learning | Code Available | 1 |
| Plan Better Amid Conservatism: Offline Multi-Agent Reinforcement Learning with Actor Rectification | Nov 22, 2021 | Continuous Control, Multi-agent Reinforcement Learning | Code Available | 1 |
| A Dataset Perspective on Offline Reinforcement Learning | Nov 8, 2021 | Offline RL, reinforcement-learning | Code Available | 1 |
| RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning | Nov 4, 2021 | Decision Making, Imitation Learning | Code Available | 1 |
| Curriculum Offline Imitation Learning | Nov 3, 2021 | continuous-control, Continuous Control | Code Available | 1 |
| False Correlation Reduction for Offline Reinforcement Learning | Oct 24, 2021 | D4RL, Decision Making | Code Available | 1 |
| Offline Reinforcement Learning with Value-based Episodic Memory | Oct 19, 2021 | D4RL, Offline RL | Code Available | 1 |
| Safe Driving via Expert Guided Policy Optimization | Oct 13, 2021 | Offline RL, reinforcement-learning | Code Available | 1 |
| Planning from Pixels in Environments with Combinatorially Hard Search Spaces | Oct 12, 2021 | continuous-control, Continuous Control | Code Available | 1 |
| Offline Reinforcement Learning with Implicit Q-Learning | Oct 12, 2021 | D4RL, Offline RL | Code Available | 1 |
| StARformer: Transformer with State-Action-Reward Representations for Visual Reinforcement Learning | Oct 12, 2021 | Imitation Learning, Inductive Bias | Code Available | 1 |
| Beyond Pick-and-Place: Tackling Robotic Stacking of Diverse Shapes | Oct 12, 2021 | Offline RL, Reinforcement Learning (RL) | Code Available | 1 |
| Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble | Oct 4, 2021 | Adroid door-cloned, Adroid door-human | Code Available | 1 |
| Offline Reinforcement Learning with Reverse Model-based Imagination | Oct 1, 2021 | Data Augmentation, model | Code Available | 1 |
| Offline Reinforcement Learning with In-sample Q-Learning | Sep 29, 2021 | D4RL, Offline RL | Code Available | 1 |
| A Workflow for Offline Model-Free Robotic Reinforcement Learning | Sep 22, 2021 | Offline RL, reinforcement-learning | Code Available | 1 |
| Model Selection for Offline Reinforcement Learning: Practical Considerations for Healthcare Settings | Jul 23, 2021 | Computational Efficiency, Decision Making | Code Available | 1 |
| Conservative Offline Distributional Reinforcement Learning | Jul 12, 2021 | D4RL, Distributional Reinforcement Learning | Code Available | 1 |
| Offline Meta-Reinforcement Learning with Online Self-Supervision | Jul 8, 2021 | Meta Reinforcement Learning, Offline RL | Code Available | 1 |
| Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble | Jul 1, 2021 | Offline RL, reinforcement-learning | Code Available | 1 |
| OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation | Jun 21, 2021 | Offline RL, Reinforcement Learning (RL) | Code Available | 1 |
| Offline RL Without Off-Policy Evaluation | Jun 16, 2021 | D4RL, Offline RL | Code Available | 1 |
| Reinforcement Learning as One Big Sequence Modeling Problem | Jun 13, 2021 | Imitation Learning, Offline RL | Code Available | 1 |
| A Minimalist Approach to Offline Reinforcement Learning | Jun 12, 2021 | Offline RL, reinforcement-learning | Code Available | 1 |
| Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning | Jun 7, 2021 | Multi-agent Reinforcement Learning, Offline RL | Code Available | 1 |
| Online reinforcement learning with sparse rewards through an active inference capsule | Jun 4, 2021 | Offline RL, reinforcement-learning | Code Available | 1 |
| Offline Reinforcement Learning as One Big Sequence Modeling Problem | Jun 3, 2021 | Imitation Learning, Offline RL | Code Available | 1 |
| Decision Transformer: Reinforcement Learning via Sequence Modeling | Jun 2, 2021 | Atari Games, D4RL | Code Available | 1 |
| Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning | May 17, 2021 | Offline RL, Q-Learning | Code Available | 1 |
| Online and Offline Reinforcement Learning by Planning with a Learned Model | Apr 13, 2021 | Atari Games, Continuous Control | Code Available | 1 |
| COMBO: Conservative Offline Model-Based Policy Optimization | Feb 16, 2021 | model, Offline RL | Code Available | 1 |
| NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning | Feb 1, 2021 | Offline RL, reinforcement-learning | Code Available | 1 |
| Offline Reinforcement Learning from Images with Latent Space Models | Dec 21, 2020 | Offline RL, reinforcement-learning | Code Available | 1 |
| Batch Exploration with Examples for Scalable Robotic Reinforcement Learning | Oct 22, 2020 | Offline RL, reinforcement-learning | Code Available | 1 |
| FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization | Oct 2, 2020 | Meta Reinforcement Learning, Metric Learning | Code Available | 1 |
| Offline Meta-Reinforcement Learning with Advantage Weighting | Aug 13, 2020 | Machine Translation, Meta-Learning | Code Available | 1 |
| Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention | Jun 29, 2020 | D4RL, Language Modelling | Code Available | 1 |
| Critic Regularized Regression | Jun 26, 2020 | Offline RL, regression | Code Available | 1 |
| Conservative Q-Learning for Offline Reinforcement Learning | Jun 8, 2020 | continuous-control, Continuous Control | Code Available | 1 |
| Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization | Jun 5, 2020 | Offline RL, reinforcement-learning | Code Available | 1 |
| Acme: A Research Framework for Distributed Reinforcement Learning | Jun 1, 2020 | Deep Reinforcement Learning, DQN Replay Dataset | Code Available | 1 |
| MOPO: Model-based Offline Policy Optimization | May 27, 2020 | continuous-control, Continuous Control | Code Available | 1 |
| MOReL : Model-Based Offline Reinforcement Learning | May 12, 2020 | model, Offline RL | Code Available | 1 |
| An Optimistic Perspective on Offline Deep Reinforcement Learning | Jan 1, 2020 | Atari Games, Deep Reinforcement Learning | Code Available | 1 |
| An Optimistic Perspective on Offline Reinforcement Learning | Jul 10, 2019 | Atari Games, Diversity | Code Available | 1 |
| From Novelty to Imitation: Self-Distilled Rewards for Offline Reinforcement Learning | Jul 17, 2025 | D4RL, Offline RL | Unverified | 0 |
| Step-wise Policy for Rare-tool Knowledge (SPaRK): Offline RL that Drives Diverse Tool Use in LLMs | Jul 15, 2025 | Diversity, MMLU | Code Available | 0 |