| Contrastive Example-Based Control | Jul 24, 2023 | Offline RL | Code Available | 0 | 5 |
| ROLeR: Effective Reward Shaping in Offline Reinforcement Learning for Recommender Systems | Jul 18, 2024 | Offline RL, Recommendation Systems | Code Available | 0 | 5 |
| Robust Offline Reinforcement learning with Heavy-Tailed Rewards | Oct 28, 2023 | Offline RL, Off-policy evaluation | Code Available | 0 | 5 |
| RL Unplugged: A Collection of Benchmarks for Offline Reinforcement Learning | Dec 1, 2020 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| Robust On-Policy Sampling for Data-Efficient Policy Evaluation in Reinforcement Learning | Nov 29, 2021 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| Revisiting Bellman Errors for Offline Model Selection | Jan 31, 2023 | Atari Games, model | Code Available | 0 | 5 |
| Robust Reinforcement Learning Objectives for Sequential Recommender Systems | May 30, 2023 | Offline RL, Recommendation Systems | Code Available | 0 | 5 |
| Scalable Decision-Making in Stochastic Environments through Learned Temporal Abstraction | Feb 28, 2025 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning | Oct 15, 2024 | D4RL, Model-based Reinforcement Learning | Code Available | 0 | 5 |
| AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization | May 28, 2024 | D4RL, Offline RL | Code Available | 0 | 5 |
| Experimental evaluation of offline reinforcement learning for HVAC control in buildings | Aug 15, 2024 | Offline RL, Reinforcement Learning (RL) | Code Available | 0 | 5 |
| Explaining RL Decisions with Trajectories | May 6, 2023 | Attribute, continuous-control | Code Available | 0 | 5 |
| Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning | Jun 14, 2022 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| Q-Value Weighted Regression: Reinforcement Learning with Limited Data | Feb 12, 2021 | Atari Games, continuous-control | Code Available | 0 | 5 |
| VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation | Feb 24, 2023 | Computational Efficiency, Offline RL | Code Available | 0 | 5 |
| PyTupli: A Scalable Infrastructure for Collaborative Offline Reinforcement Learning Projects | May 22, 2025 | Offline RL, Reinforcement Learning (RL) | Code Available | 0 | 5 |
| Solving Offline Reinforcement Learning with Decision Tree Regression | Jan 21, 2024 | D4RL, Feature Importance | Code Available | 0 | 5 |
| Policy-regularized Offline Multi-objective Reinforcement Learning | Jan 4, 2024 | Multi-Objective Reinforcement Learning, Offline RL | Code Available | 0 | 5 |
| POPO: Pessimistic Offline Policy Optimization | Dec 26, 2020 | Offline RL, Q-Learning | Code Available | 0 | 5 |
| Policy Constraint by Only Support Constraint for Offline Reinforcement Learning | Mar 7, 2025 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| Compositional Conservatism: A Transductive Approach in Offline Reinforcement Learning | Apr 6, 2024 | D4RL, Offline RL | Code Available | 0 | 5 |
| Preference-Guided Reflective Sampling for Aligning Language Models | Aug 22, 2024 | Document Summarization, Instruction Following | Code Available | 0 | 5 |
| POCE: Primal Policy Optimization with Conservative Estimation for Multi-constraint Offline Reinforcement Learning | Jan 1, 2024 | Offline RL, Reinforcement Learning (RL) | Code Available | 0 | 5 |
| Optimistic Critic Reconstruction and Constrained Fine-Tuning for General Offline-to-Online RL | Dec 25, 2024 | Offline RL, Reinforcement Learning (RL) | Code Available | 0 | 5 |
| On the Effectiveness of Offline RL for Dialogue Response Generation | Jul 23, 2023 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| Active Advantage-Aligned Online Reinforcement Learning with Offline Data | Feb 11, 2025 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| On Practical Reinforcement Learning: Provable Robustness, Scalability, and Statistical Efficiency | Mar 3, 2022 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems | Mar 2, 2022 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| Offline RL with Smooth OOD Generalization in Convex Hull and its Neighborhood | Jun 10, 2025 | Computational Efficiency, D4RL | Code Available | 0 | 5 |
| Off-policy Evaluation in Doubly Inhomogeneous Environments | Jun 14, 2023 | Offline RL, Off-policy evaluation | Code Available | 0 | 5 |
| DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning under Uncertainty | Jun 14, 2025 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| MOBODY: Model Based Off-Dynamics Offline Reinforcement Learning | Jun 10, 2025 | Data Augmentation, model | Code Available | 0 | 5 |
| Offline RL With Resource Constrained Online Deployment | Oct 7, 2021 | D4RL, Offline RL | Code Available | 0 | 5 |
| CAWR: Corruption-Averse Advantage-Weighted Regression for Robust Policy Optimization | Jun 18, 2025 | D4RL, Offline RL | Code Available | 0 | 5 |
| Double Check Your State Before Trusting It: Confidence-Aware Bidirectional Offline Model-Based Imagination | Jun 16, 2022 | D4RL, Offline RL | Code Available | 0 | 5 |
| Offline Equilibrium Finding | Jul 12, 2022 | Offline RL | Code Available | 0 | 5 |
| Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees | Nov 14, 2023 | Offline RL | Code Available | 0 | 5 |
| Fat-to-Thin Policy Optimization: Offline RL with Sparse Policies | Jan 24, 2025 | MuJoCo, Offline RL | Code Available | 0 | 5 |
| Building Persona Consistent Dialogue Agents with Offline Reinforcement Learning | Oct 16, 2023 | Chatbot, Offline RL | Code Available | 0 | 5 |
| NetworkGym: Reinforcement Learning Environments for Multi-Access Traffic Management in Network Simulation | Oct 30, 2024 | D4RL, Management | Code Available | 0 | 5 |
| Multi-Game Decision Transformers | May 30, 2022 | Atari Games, Offline RL | Code Available | 0 | 5 |
| Diffusion Models as Optimizers for Efficient Planning in Offline RL | Jul 23, 2024 | D4RL, Decision Making | Code Available | 0 | 5 |
| Continual Task Learning through Adaptive Policy Self-Composition | Nov 18, 2024 | Continual Learning, Offline RL | Code Available | 0 | 5 |
| Mutual Information Regularized Offline Reinforcement Learning | Oct 14, 2022 | D4RL, Offline RL | Code Available | 0 | 5 |
| RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning | Jun 24, 2020 | Atari Games, DQN Replay Dataset | Code Available | 0 | 5 |
| Offline Reinforcement Learning from Datasets with Structured Non-Stationarity | May 23, 2024 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| Bridging Distributionally Robust Learning and Offline RL: An Approach to Mitigate Distribution Shift and Partial Data Coverage | Oct 27, 2023 | Offline RL, Reinforcement Learning (RL) | Code Available | 0 | 5 |
| BRAC+: Improved Behavior Regularized Actor Critic for Offline Reinforcement Learning | Oct 2, 2021 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| Two-step reinforcement learning for model-free redesign of nonlinear optimal regulator | Mar 5, 2021 | Offline RL, reinforcement-learning | Code Available | 0 | 5 |
| Model-based Offline Policy Optimization with Adversarial Network | Sep 5, 2023 | model, Offline RL | Code Available | 0 | 5 |