| Contrastive Example-Based Control | Jul 24, 2023 | Offline RL | Code Available | 0 | 5 |
| VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation | Feb 24, 2023 | Computational Efficiency, Offline RL | Code Available | 0 | 5 |
| Continual Task Learning through Adaptive Policy Self-Composition | Nov 18, 2024 | Continual Learning, Offline RL | Code Available | 0 | 5 |
| PyTupli: A Scalable Infrastructure for Collaborative Offline Reinforcement Learning Projects | May 22, 2025 | Offline RL, Reinforcement Learning (RL) | Code Available | 0 | 5 |
| Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning | Oct 15, 2024 | D4RL, Model-based Reinforcement Learning | Code Available | 0 | 5 |
| AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization | May 28, 2024 | D4RL, Offline RL | Code Available | 0 | 5 |
| Preference-Guided Reflective Sampling for Aligning Language Models | Aug 22, 2024 | Document Summarization, Instruction Following | Code Available | 0 | 5 |
| Policy Constraint by Only Support Constraint for Offline Reinforcement Learning | Mar 7, 2025 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| Policy-regularized Offline Multi-objective Reinforcement Learning | Jan 4, 2024 | Multi-Objective Reinforcement Learning, Offline RL | Code Available | 0 | 5 |
| POPO: Pessimistic Offline Policy Optimization | Dec 26, 2020 | Offline RL, Q-Learning | Code Available | 0 | 5 |
| Solving Offline Reinforcement Learning with Decision Tree Regression | Jan 21, 2024 | D4RL, Feature Importance | Code Available | 0 | 5 |
| Compositional Conservatism: A Transductive Approach in Offline Reinforcement Learning | Apr 6, 2024 | D4RL, Offline RL | Code Available | 0 | 5 |
| On Practical Reinforcement Learning: Provable Robustness, Scalability, and Statistical Efficiency | Mar 3, 2022 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| On the Effectiveness of Offline RL for Dialogue Response Generation | Jul 23, 2023 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| Optimistic Critic Reconstruction and Constrained Fine-Tuning for General Offline-to-Online RL | Dec 25, 2024 | Offline RL, Reinforcement Learning (RL) | Code Available | 0 | 5 |
| Off-policy Evaluation in Doubly Inhomogeneous Environments | Jun 14, 2023 | Offline RL, Off-policy evaluation | Code Available | 0 | 5 |
| Offline RL with Smooth OOD Generalization in Convex Hull and its Neighborhood | Jun 10, 2025 | Computational Efficiency, D4RL | Code Available | 0 | 5 |
| Active Advantage-Aligned Online Reinforcement Learning with Offline Data | Feb 11, 2025 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| Offline RL With Resource Constrained Online Deployment | Oct 7, 2021 | D4RL, Offline RL | Code Available | 0 | 5 |
| A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems | Mar 2, 2022 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning under Uncertainty | Jun 14, 2025 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| MOBODY: Model Based Off-Dynamics Offline Reinforcement Learning | Jun 10, 2025 | Data Augmentation, model | Code Available | 0 | 5 |
| Offline Reinforcement Learning from Datasets with Structured Non-Stationarity | May 23, 2024 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| CAWR: Corruption-Averse Advantage-Weighted Regression for Robust Policy Optimization | Jun 18, 2025 | D4RL, Offline RL | Code Available | 0 | 5 |
| Double Check Your State Before Trusting It: Confidence-Aware Bidirectional Offline Model-Based Imagination | Jun 16, 2022 | D4RL, Offline RL | Code Available | 0 | 5 |