| Flow Q-Learning | Feb 4, 2025 | Action GenerationD4RL | CodeCode Available | 3 |
| CORL: Research-oriented Deep Offline Reinforcement Learning Library | Oct 13, 2022 | BenchmarkingD4RL | CodeCode Available | 3 |
| Skill Expansion and Composition in Parameter Space | Feb 9, 2025 | D4RL | CodeCode Available | 2 |
| Datasets and Benchmarks for Offline Safe Reinforcement Learning | Jun 15, 2023 | Autonomous DrivingBenchmarking | CodeCode Available | 2 |
| Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning | Aug 12, 2022 | D4RLOffline RL | CodeCode Available | 2 |
| Flowformer: Linearizing Transformers with Conservation Flows | Feb 13, 2022 | D4RLOffline RL | CodeCode Available | 2 |
| Online Decision Transformer | Feb 11, 2022 | D4RLEfficient Exploration | CodeCode Available | 2 |
| Rethinking Attention with Performers | Sep 30, 2020 | D4RLImage Generation | CodeCode Available | 2 |
| D4RL: Datasets for Deep Data-Driven Reinforcement Learning | Apr 15, 2020 | D4RLOffline RL | CodeCode Available | 2 |
| Reformer: The Efficient Transformer | Jan 13, 2020 | D4RLImage Generation | CodeCode Available | 2 |
| Habitizing Diffusion Planning for Efficient and Effective Decision Making | Feb 10, 2025 | CPUD4RL | CodeCode Available | 1 |
| Are Expressive Models Truly Necessary for Offline RL? | Dec 15, 2024 | D4RLOffline RL | CodeCode Available | 1 |
| M^3PC: Test-time Model Predictive Control for Pretrained Masked Trajectory Model | Dec 7, 2024 | D4RLmodel | CodeCode Available | 1 |
| Aligning Diffusion Behaviors with Q-functions for Efficient Continuous Control | Jul 12, 2024 | continuous-controlContinuous Control | CodeCode Available | 1 |
| PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer | Jun 10, 2024 | continuous-controlContinuous Control | CodeCode Available | 1 |
| Strategically Conservative Q-Learning | Jun 6, 2024 | D4RLOffline RL | CodeCode Available | 1 |
| Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning | May 31, 2024 | D4RLReinforcement Learning (RL) | CodeCode Available | 1 |
| In-Context Decision Transformer: Reinforcement Learning via Hierarchical Chain-of-Thought | May 31, 2024 | D4RLDecision Making | CodeCode Available | 1 |
| Diffusion Policies creating a Trust Region for Offline Reinforcement Learning | May 30, 2024 | D4RLDenoising | CodeCode Available | 1 |
| Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning | May 30, 2024 | D4RLreinforcement-learning | CodeCode Available | 1 |
| Q-value Regularized Transformer for Offline Reinforcement Learning | May 27, 2024 | D4RLOffline RL | CodeCode Available | 1 |
| Reinformer: Max-Return Sequence Modeling for Offline RL | May 14, 2024 | D4RLOffline RL | CodeCode Available | 1 |
| SEABO: A Simple Search-Based Method for Offline Imitation Learning | Feb 6, 2024 | D4RLImitation Learning | CodeCode Available | 1 |
| Entropy-regularized Diffusion Policy with Q-Ensembles for Offline Reinforcement Learning | Feb 6, 2024 | D4RLOffline RL | CodeCode Available | 1 |
| Exploration and Anti-Exploration with Distributional Random Network Distillation | Jan 18, 2024 | D4RLDeep Reinforcement Learning | CodeCode Available | 1 |
| Critic-Guided Decision Transformer for Offline Reinforcement Learning | Dec 21, 2023 | D4RLOffline RL | CodeCode Available | 1 |
| Train Once, Get a Family: State-Adaptive Balances for Offline-to-Online Reinforcement Learning | Oct 27, 2023 | D4RLReinforcement Learning (RL) | CodeCode Available | 1 |
| CROP: Conservative Reward for Model-based Offline Policy Optimization | Oct 26, 2023 | D4RLOffline RL | CodeCode Available | 1 |
| Offline Retraining for Online RL: Decoupled Policy Learning to Mitigate Exploration Bias | Oct 12, 2023 | D4RLOffline RL | CodeCode Available | 1 |
| Score Regularized Policy Optimization through Diffusion Behavior | Oct 11, 2023 | D4RL | CodeCode Available | 1 |
| Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets | Oct 6, 2023 | D4RLDecision Making | CodeCode Available | 1 |
| Reasoning with Latent Diffusion in Offline Reinforcement Learning | Sep 12, 2023 | D4RLOffline RL | CodeCode Available | 1 |
| Model-Bellman Inconsistency for Model-based Offline Reinforcement Learning | Jul 1, 2023 | D4RLmodel | CodeCode Available | 1 |
| Katakomba: Tools and Benchmarks for Data-Driven NetHack | Jun 14, 2023 | D4RLNetHack | CodeCode Available | 1 |
| Curricular Subgoals for Inverse Reinforcement Learning | Jun 14, 2023 | Autonomous DrivingD4RL | CodeCode Available | 1 |
| Improving and Benchmarking Offline Reinforcement Learning Algorithms | Jun 1, 2023 | AttributeBenchmarking | CodeCode Available | 1 |
| Efficient Diffusion Policies for Offline Reinforcement Learning | May 31, 2023 | D4RLOffline RL | CodeCode Available | 1 |
| Primal-Attention: Self-attention through Asymmetric Kernel SVD in Primal Representation | May 31, 2023 | D4RLLanguage Modelling | CodeCode Available | 1 |
| When should we prefer Decision Transformers for Offline Reinforcement Learning? | May 23, 2023 | D4RLImitation Learning | CodeCode Available | 1 |
| Revisiting the Minimalist Approach to Offline Reinforcement Learning | May 16, 2023 | D4RLOffline RL | CodeCode Available | 1 |
| Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning | Apr 25, 2023 | D4RLImage Generation | CodeCode Available | 1 |
| Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization | Mar 28, 2023 | D4RLOffline RL | CodeCode Available | 1 |
| Optimal Transport for Offline Imitation Learning | Mar 24, 2023 | D4RLDecision Making | CodeCode Available | 1 |
| Behavior Proximal Policy Optimization | Feb 22, 2023 | D4RLOffline RL | CodeCode Available | 1 |
| When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning | Feb 15, 2023 | Autonomous Drivingcontinuous-control | CodeCode Available | 1 |
| Anti-Exploration by Random Network Distillation | Jan 31, 2023 | D4RL | CodeCode Available | 1 |
| Extreme Q-Learning: MaxEnt RL without Entropy | Jan 5, 2023 | D4RLDeep Reinforcement Learning | CodeCode Available | 1 |
| Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning | Oct 25, 2022 | D4RLOffline RL | CodeCode Available | 1 |
| A Policy-Guided Imitation Approach for Offline Reinforcement Learning | Oct 15, 2022 | D4RLOffline RL | CodeCode Available | 1 |
| Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories | Oct 12, 2022 | D4RLOffline RL | CodeCode Available | 1 |