| Q-value Regularized Transformer for Offline Reinforcement Learning | May 27, 2024 | D4RL, Offline RL | Code Available | 1 | 5 |
| Anti-Exploration by Random Network Distillation | Jan 31, 2023 | D4RL | Code Available | 1 | 5 |
| Reasoning with Latent Diffusion in Offline Reinforcement Learning | Sep 12, 2023 | D4RL, Offline RL | Code Available | 1 | 5 |
| Decision Transformer: Reinforcement Learning via Sequence Modeling | Jun 2, 2021 | Atari Games, D4RL | Code Available | 1 | 5 |
| Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets | Oct 6, 2023 | D4RL, Decision Making | Code Available | 1 | 5 |
| Reinformer: Max-Return Sequence Modeling for Offline RL | May 14, 2024 | D4RL, Offline RL | Code Available | 1 | 5 |
| Katakomba: Tools and Benchmarks for Data-Driven NetHack | Jun 14, 2023 | D4RL, NetHack | Code Available | 1 | 5 |
| Strategically Conservative Q-Learning | Jun 6, 2024 | D4RL, Offline RL | Code Available | 1 | 5 |
| False Correlation Reduction for Offline Reinforcement Learning | Oct 24, 2021 | D4RL, Decision Making | Code Available | 1 | 5 |
| Score Regularized Policy Optimization through Diffusion Behavior | Oct 11, 2023 | D4RL | Code Available | 1 | 5 |
| Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning | May 31, 2024 | D4RL, Reinforcement Learning (RL) | Code Available | 1 | 5 |
| SEABO: A Simple Search-Based Method for Offline Imitation Learning | Feb 6, 2024 | D4RL, Imitation Learning | Code Available | 1 | 5 |
| Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble | Oct 4, 2021 | Adroid door-cloned, Adroid door-human | Code Available | 1 | 5 |
| Exploration and Anti-Exploration with Distributional Random Network Distillation | Jan 18, 2024 | D4RL, Deep Reinforcement Learning | Code Available | 1 | 5 |
| Diffusion Policies creating a Trust Region for Offline Reinforcement Learning | May 30, 2024 | D4RL, Denoising | Code Available | 1 | 5 |
| When should we prefer Decision Transformers for Offline Reinforcement Learning? | May 23, 2023 | D4RL, Imitation Learning | Code Available | 1 | 5 |
| A Policy-Guided Imitation Approach for Offline Reinforcement Learning | Oct 15, 2022 | D4RL, Offline RL | Code Available | 1 | 5 |
| Implicit Behavioral Cloning | Sep 1, 2021 | D4RL | Code Available | 1 | 5 |
| Robust Offline Reinforcement Learning with Gradient Penalty and Constraint Relaxation | Oct 19, 2022 | D4RL, MuJoCo | Code Available | 0 | 5 |
| Skill Decision Transformer | Jan 31, 2023 | D4RL, Descriptive | Code Available | 0 | 5 |
| Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning | Jun 12, 2024 | D4RL, MuJoCo | Code Available | 0 | 5 |
| DiffCPS: Diffusion Model based Constrained Policy Search for Offline Reinforcement Learning | Oct 9, 2023 | D4RL, Offline RL | Code Available | 0 | 5 |
| DIDI: Diffusion-Guided Diversity for Offline Behavioral Generation | May 23, 2024 | D4RL, Decision Making | Code Available | 0 | 5 |
| Beyond the Known: Decision Making with Counterfactual Reasoning Decision Transformer | May 14, 2025 | counterfactual, Counterfactual Reasoning | Code Available | 0 | 5 |
| Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL | Sep 8, 2022 | D4RL, Offline RL | Code Available | 0 | 5 |
| AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization | May 28, 2024 | D4RL, Offline RL | Code Available | 0 | 5 |
| Decision Mamba Architectures | May 13, 2024 | D4RL, Imitation Learning | Code Available | 0 | 5 |
| d3rlpy: An Offline Deep Reinforcement Learning Library | Nov 6, 2021 | D4RL, Deep Reinforcement Learning | Code Available | 0 | 5 |
| Pre-training with Synthetic Data Helps Offline Reinforcement Learning | Oct 1, 2023 | D4RL, Deep Reinforcement Learning | Code Available | 0 | 5 |
| Q-Distribution guided Q-learning for offline reinforcement learning: Uncertainty penalized Q-value via consistency model | Oct 27, 2024 | D4RL, Q-Learning | Code Available | 0 | 5 |
| Solving Offline Reinforcement Learning with Decision Tree Regression | Jan 21, 2024 | D4RL, Feature Importance | Code Available | 0 | 5 |
| Stabilizing Extreme Q-learning by Maclaurin Expansion | Jun 7, 2024 | D4RL, Offline RL | Code Available | 0 | 5 |
| Offline RL With Resource Constrained Online Deployment | Oct 7, 2021 | D4RL, Offline RL | Code Available | 0 | 5 |
| Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning | Oct 15, 2024 | D4RL, Model-based Reinforcement Learning | Code Available | 0 | 5 |
| Offline RL with Smooth OOD Generalization in Convex Hull and its Neighborhood | Jun 10, 2025 | Computational Efficiency, D4RL | Code Available | 0 | 5 |
| Mutual Information Regularized Offline Reinforcement Learning | Oct 14, 2022 | D4RL, Offline RL | Code Available | 0 | 5 |
| NetworkGym: Reinforcement Learning Environments for Multi-Access Traffic Management in Network Simulation | Oct 30, 2024 | D4RL, Management | Code Available | 0 | 5 |
| Model-based Offline Reinforcement Learning with Count-based Conservatism | Jul 21, 2023 | D4RL, Offline RL | Code Available | 0 | 5 |
| Offline Behavior Distillation | Oct 30, 2024 | D4RL, Reinforcement Learning (RL) | Code Available | 0 | 5 |
| Constrained Latent Action Policies for Model-Based Offline Reinforcement Learning | Nov 7, 2024 | D4RL, reinforcement-learning | Code Available | 0 | 5 |
| Conservative State Value Estimation for Offline Reinforcement Learning | Feb 14, 2023 | D4RL, reinforcement-learning | Code Available | 0 | 5 |
| Learning to Trust Bellman Updates: Selective State-Adaptive Regularization for Offline RL | May 26, 2025 | D4RL, Offline RL | Code Available | 0 | 5 |
| Double Check Your State Before Trusting It: Confidence-Aware Bidirectional Offline Model-Based Imagination | Jun 16, 2022 | D4RL, Offline RL | Code Available | 0 | 5 |
| Grid-Mapping Pseudo-Count Constraint for Offline Reinforcement Learning | Apr 3, 2024 | D4RL, reinforcement-learning | Code Available | 0 | 5 |
| Learning from Sparse Offline Datasets via Conservative Density Estimation | Jan 16, 2024 | D4RL, Density Estimation | Code Available | 0 | 5 |
| A Pragmatic Look at Deep Imitation Learning | Aug 4, 2021 | Behavioural cloning, D4RL | Code Available | 0 | 5 |
| Conservative Bayesian Model-Based Value Expansion for Offline Policy Optimization | Oct 7, 2022 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| A2PO: Towards Effective Offline Reinforcement Learning from an Advantage-aware Perspective | Mar 12, 2024 | D4RL, reinforcement-learning | Code Available | 0 | 5 |
| Learning on One Mode: Addressing Multi-Modality in Offline Reinforcement Learning | Dec 4, 2024 | D4RL, Imitation Learning | Code Available | 0 | 5 |
| Mildly Constrained Evaluation Policy for Offline Reinforcement Learning | Jun 6, 2023 | D4RL, MuJoCo | Code Available | 0 | 5 |