| Direct Preference-based Policy Optimization without Reward Modeling | Jan 30, 2023 | Contrastive Learning, Offline RL | Code Available | 1 | 5 |
| Masked Autoencoding for Scalable and Generalizable Decision Making | Nov 23, 2022 | Decision Making, Offline RL | Code Available | 1 | 5 |
| Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information | Oct 31, 2022 | Offline RL, Reinforcement Learning (RL) | Code Available | 1 | 5 |
| Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations | Jul 20, 2022 | Imitation Learning, Offline RL | Code Available | 1 | 5 |
| A Workflow for Offline Model-Free Robotic Reinforcement Learning | Sep 22, 2021 | Offline RL, reinforcement-learning | Code Available | 1 | 5 |
| AdaCat: Adaptive Categorical Discretization for Autoregressive Models | Aug 3, 2022 | Density Estimation, Offline RL | Code Available | 1 | 5 |
| Dual RL: Unification and New Methods for Reinforcement and Imitation Learning | Feb 16, 2023 | Imitation Learning, Offline RL | Code Available | 1 | 5 |
| Decision Transformer: Reinforcement Learning via Sequence Modeling | Jun 2, 2021 | Atari Games, D4RL | Code Available | 1 | 5 |
| Alleviating Matthew Effect of Offline Reinforcement Learning in Interactive Recommendation | Jul 10, 2023 | Decision Making, Interactive Recommendation | Code Available | 1 | 5 |
| Improving and Benchmarking Offline Reinforcement Learning Algorithms | Jun 1, 2023 | Attribute, Benchmarking | Code Available | 1 | 5 |
| All You Need Is Supervised Learning: From Imitation Learning to Meta-RL With Upside Down RL | Feb 24, 2022 | All, Imitation Learning | Code Available | 1 | 5 |
| Efficient Diffusion Policies for Offline Reinforcement Learning | May 31, 2023 | D4RL, Offline RL | Code Available | 1 | 5 |
| DataLight: Offline Data-Driven Traffic Signal Control | Mar 20, 2023 | Offline RL, Reinforcement Learning (RL) | Code Available | 1 | 5 |
| Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization | Jun 5, 2020 | Offline RL, reinforcement-learning | Code Available | 1 | 5 |
| IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies | Apr 20, 2023 | Offline RL, Q-Learning | Code Available | 1 | 5 |
| Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models | May 24, 2023 | Language Modelling, Offline RL | Code Available | 1 | 5 |
| GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning | May 27, 2024 | Data Augmentation, Decision Making | Code Available | 1 | 5 |
| Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning | Sep 22, 2023 | counterfactual, Multi-agent Reinforcement Learning | Code Available | 1 | 5 |
| Guiding Online Reinforcement Learning with Action-Free Offline Pretraining | Jan 30, 2023 | Offline RL, reinforcement-learning | Code Available | 1 | 5 |
| Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search | May 24, 2024 | Code Generation, Language Modelling | Code Available | 1 | 5 |
| Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning | Oct 30, 2023 | Decision Making, Offline RL | Code Available | 1 | 5 |
| GNN-DT: Graph Neural Network Enhanced Decision Transformer for Efficient Optimization in Dynamic Environments | Feb 3, 2025 | Efficient Exploration, Graph Neural Network | Code Available | 1 | 5 |
| Harnessing Mixed Offline Reinforcement Learning Datasets via Trajectory Weighting | Jun 22, 2023 | Offline RL, reinforcement-learning | Code Available | 1 | 5 |
| In-Dataset Trajectory Return Regularization for Offline Preference-based Reinforcement Learning | Dec 12, 2024 | Offline RL | Code Available | 1 | 5 |
| Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets | Oct 6, 2023 | D4RL, Decision Making | Code Available | 1 | 5 |
| An Optimistic Perspective on Offline Deep Reinforcement Learning | Jan 1, 2020 | Atari Games, Deep Reinforcement Learning | Code Available | 1 | 5 |
| Consistency Models as a Rich and Efficient Policy Class for Reinforcement Learning | Sep 29, 2023 | Image Generation, Offline RL | Code Available | 1 | 5 |
| Beyond Pick-and-Place: Tackling Robotic Stacking of Diverse Shapes | Oct 12, 2021 | Offline RL, Reinforcement Learning (RL) | Code Available | 1 | 5 |
| Beyond OOD State Actions: Supported Cross-Domain Offline Reinforcement Learning | Jun 22, 2023 | Data Augmentation, Offline RL | Code Available | 1 | 5 |
| CROP: Conservative Reward for Model-based Offline Policy Optimization | Oct 26, 2023 | D4RL, Offline RL | Code Available | 1 | 5 |
| COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation | Apr 19, 2022 | Offline RL, Off-policy evaluation | Code Available | 1 | 5 |
| Reliable Conditioning of Behavioral Cloning for Offline Reinforcement Learning | Oct 11, 2022 | Offline RL, reinforcement-learning | Code Available | 1 | 5 |
| A Policy-Guided Imitation Approach for Offline Reinforcement Learning | Oct 15, 2022 | D4RL, Offline RL | Code Available | 1 | 5 |
| cosFormer: Rethinking Softmax in Attention | Feb 17, 2022 | D4RL, Language Modeling | Code Available | 1 | 5 |
| Critic-Guided Decision Transformer for Offline Reinforcement Learning | Dec 21, 2023 | D4RL, Offline RL | Code Available | 1 | 5 |
| Critic Regularized Regression | Jun 26, 2020 | Offline RL, regression | Code Available | 1 | 5 |
| Are Expressive Models Truly Necessary for Offline RL? | Dec 15, 2024 | D4RL, Offline RL | Code Available | 1 | 5 |
| Curriculum Offline Imitation Learning | Nov 3, 2021 | continuous-control, Continuous Control | Code Available | 1 | 5 |
| Federated Ensemble-Directed Offline Reinforcement Learning | May 4, 2023 | continuous-control, Continuous Control | Code Available | 1 | 5 |
| Conservative Offline Distributional Reinforcement Learning | Jul 12, 2021 | D4RL, Distributional Reinforcement Learning | Code Available | 1 | 5 |
| Conservative Q-Learning for Offline Reinforcement Learning | Jun 8, 2020 | continuous-control, Continuous Control | Code Available | 1 | 5 |
| Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning | Mar 9, 2023 | Offline RL, Q-Learning | Code Available | 1 | 5 |
| Constraint-Adaptive Policy Switching for Offline Safe Reinforcement Learning | Dec 25, 2024 | Decision Making, Offline RL | Code Available | 1 | 5 |
| ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts | May 15, 2025 | Continual Learning, Language Modeling | Code Available | 1 | 5 |
| COMBO: Conservative Offline Model-Based Policy Optimization | Feb 16, 2021 | model, Offline RL | Code Available | 1 | 5 |
| Adversarially Trained Actor Critic for Offline Reinforcement Learning | Feb 5, 2022 | continuous-control, Continuous Control | Code Available | 1 | 5 |
| Acme: A Research Framework for Distributed Reinforcement Learning | Jun 1, 2020 | Deep Reinforcement Learning, DQN Replay Dataset | Code Available | 1 | 5 |
| Zero-Shot Reinforcement Learning from Low Quality Data | Sep 26, 2023 | Offline RL, reinforcement-learning | Code Available | 1 | 5 |
| Extreme Q-Learning: MaxEnt RL without Entropy | Jan 5, 2023 | D4RL, Deep Reinforcement Learning | Code Available | 1 | 5 |
| Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning | Jun 7, 2021 | Multi-agent Reinforcement Learning, Offline RL | Code Available | 1 | 5 |