| Diffusion Models as Optimizers for Efficient Planning in Offline RL | Jul 23, 2024 | D4RL, Decision Making | Code Available | 0 | 5 |
| POCE: Primal Policy Optimization with Conservative Estimation for Multi-constraint Offline Reinforcement Learning | Jan 1, 2024 | Offline RL, Reinforcement Learning (RL) | Code Available | 0 | 5 |
| Preference-Guided Reflective Sampling for Aligning Language Models | Aug 22, 2024 | Document Summarization, Instruction Following | Code Available | 0 | 5 |
| Bridging Distributionally Robust Learning and Offline RL: An Approach to Mitigate Distribution Shift and Partial Data Coverage | Oct 27, 2023 | Offline RL, Reinforcement Learning (RL) | Code Available | 0 | 5 |
| BRAC+: Improved Behavior Regularized Actor Critic for Offline Reinforcement Learning | Oct 2, 2021 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| On the Effectiveness of Offline RL for Dialogue Response Generation | Jul 23, 2023 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| DiffCPS: Diffusion Model based Constrained Policy Search for Offline Reinforcement Learning | Oct 9, 2023 | D4RL, Offline RL | Code Available | 0 | 5 |
| Optimistic Critic Reconstruction and Constrained Fine-Tuning for General Offline-to-Online RL | Dec 25, 2024 | Offline RL, Reinforcement Learning (RL) | Code Available | 0 | 5 |
| DiffClone: Enhanced Behaviour Cloning in Robotics with Diffusion-Driven Policy Learning | Jan 17, 2024 | Offline RL, Robot Manipulation | Code Available | 0 | 5 |
| Off-policy Evaluation in Doubly Inhomogeneous Environments | Jun 14, 2023 | Offline RL, Off-policy evaluation | Code Available | 0 | 5 |
| A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning | Jul 24, 2023 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| Offline RL With Resource Constrained Online Deployment | Oct 7, 2021 | D4RL, Offline RL | Code Available | 0 | 5 |
| Offline RL with Smooth OOD Generalization in Convex Hull and its Neighborhood | Jun 10, 2025 | Computational Efficiency, D4RL | Code Available | 0 | 5 |
| On Practical Reinforcement Learning: Provable Robustness, Scalability, and Statistical Efficiency | Mar 3, 2022 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| Solving Offline Reinforcement Learning with Decision Tree Regression | Jan 21, 2024 | D4RL, Feature Importance | Code Available | 0 | 5 |
| Beyond Reward: Offline Preference-guided Policy Optimization | May 25, 2023 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs | Oct 18, 2020 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| Decision Transformer under Random Frame Dropping | Mar 3, 2023 | Deep Reinforcement Learning, MuJoCo | Code Available | 0 | 5 |
| Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL | Jun 8, 2024 | Data Augmentation, Mamba | Code Available | 0 | 5 |
| DCUR: Data Curriculum for Teaching via Samples with Reinforcement Learning | Sep 15, 2021 | Deep Reinforcement Learning, Offline RL | Code Available | 0 | 5 |
| Offline Equilibrium Finding | Jul 12, 2022 | Offline RL | Code Available | 0 | 5 |
| Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees | Nov 14, 2023 | Offline RL | Code Available | 0 | 5 |
| Offline Reinforcement Learning from Datasets with Structured Non-Stationarity | May 23, 2024 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| NetworkGym: Reinforcement Learning Environments for Multi-Access Traffic Management in Network Simulation | Oct 30, 2024 | D4RL, Management | Code Available | 0 | 5 |
| Multi-Game Decision Transformers | May 30, 2022 | Atari Games, Offline RL | Code Available | 0 | 5 |
| Mutual Information Regularized Offline Reinforcement Learning | Oct 14, 2022 | D4RL, Offline RL | Code Available | 0 | 5 |
| MORE-3S: Multimodal-based Offline Reinforcement Learning with Shared Semantic Spaces | Feb 20, 2024 | Decision Making, Offline RL | Code Available | 0 | 5 |
| Model-based Offline Reinforcement Learning with Count-based Conservatism | Jul 21, 2023 | D4RL, Offline RL | Code Available | 0 | 5 |
| Model-based Offline Policy Optimization with Adversarial Network | Sep 5, 2023 | model, Offline RL | Code Available | 0 | 5 |
| d3rlpy: An Offline Deep Reinforcement Learning Library | Nov 6, 2021 | D4RL, Deep Reinforcement Learning | Code Available | 0 | 5 |
| Model-Based Offline Planning with Trajectory Pruning | May 16, 2021 | model, Offline RL | Code Available | 0 | 5 |
| Two-step reinforcement learning for model-free redesign of nonlinear optimal regulator | Mar 5, 2021 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| MICRO: Model-Based Offline Reinforcement Learning with a Conservative Bellman Operator | Dec 7, 2023 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| MAHALO: Unifying Offline Reinforcement Learning and Imitation Learning from Observations | Mar 30, 2023 | Decision Making, Imitation Learning | Code Available | 0 | 5 |
| Model-Based Offline Reinforcement Learning with Pessimism-Modulated Dynamics Belief | Oct 13, 2022 | D4RL, Offline RL | Code Available | 0 | 5 |
| Mildly Constrained Evaluation Policy for Offline Reinforcement Learning | Jun 6, 2023 | D4RL, MuJoCo | Code Available | 0 | 5 |
| A Low Latency Adaptive Coding Spiking Framework for Deep Reinforcement Learning | Nov 21, 2022 | Deep Reinforcement Learning, Offline RL | Code Available | 0 | 5 |
| Learning Versatile Skills with Curriculum Masking | Oct 23, 2024 | Decision Making, Offline RL | Code Available | 0 | 5 |
| Learning to Reach Goals via Diffusion | Oct 4, 2023 | Computational Efficiency, Decision Making | Code Available | 0 | 5 |
| Behavior Prior Representation learning for Offline Reinforcement Learning | Nov 2, 2022 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| Learning to Trust Bellman Updates: Selective State-Adaptive Regularization for Offline RL | May 26, 2025 | D4RL, Offline RL | Code Available | 0 | 5 |
| Latent Safety-Constrained Policy Approach for Safe Offline Reinforcement Learning | Dec 11, 2024 | Autonomous Driving, Offline RL | Code Available | 0 | 5 |
| Corruption-Robust Offline Reinforcement Learning with General Function Approximation | Oct 23, 2023 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| Learning from Sparse Offline Datasets via Conservative Density Estimation | Jan 16, 2024 | D4RL, Density Estimation | Code Available | 0 | 5 |
| Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning? | Jun 10, 2024 | Deep Reinforcement Learning, Offline RL | Code Available | 0 | 5 |
| Behavior Estimation from Multi-Source Data for Offline Reinforcement Learning | Nov 29, 2022 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning? | May 20, 2024 | Atari Games, Mamba | Code Available | 0 | 5 |
| Learning to Control Autonomous Fleets from Observation via Offline Reinforcement Learning | Feb 28, 2023 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| Leveraging Unlabeled Data Sharing through Kernel Function Approximation in Offline Reinforcement Learning | Aug 22, 2024 | Offline RL, Reinforcement Learning (RL) | Code Available | 0 | 5 |
| Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning | Jun 14, 2022 | continuous-control, Continuous Control | Code Available | 0 | 5 |