| Policy-regularized Offline Multi-objective Reinforcement Learning | Jan 4, 2024 | Multi-Objective Reinforcement Learning, Offline RL | Code Available | 0 |
| POPO: Pessimistic Offline Policy Optimization | Dec 26, 2020 | Offline RL, Q-Learning | Code Available | 0 |
| d3rlpy: An Offline Deep Reinforcement Learning Library | Nov 6, 2021 | D4RL, Deep Reinforcement Learning | Code Available | 0 |
| Preference-Guided Reflective Sampling for Aligning Language Models | Aug 22, 2024 | Document Summarization, Instruction Following | Code Available | 0 |
| MOBODY: Model Based Off-Dynamics Offline Reinforcement Learning | Jun 10, 2025 | Data Augmentation, model | Code Available | 0 |
| Using Offline Data to Speed Up Reinforcement Learning in Procedurally Generated Environments | Apr 18, 2023 | Imitation Learning, Offline RL | Code Available | 0 |
| Offline Equilibrium Finding | Jul 12, 2022 | Offline RL | Code Available | 0 |
| A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning | Jul 24, 2023 | Offline RL, reinforcement-learning | Code Available | 0 |
| Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees | Nov 14, 2023 | Offline RL | Code Available | 0 |
| NetworkGym: Reinforcement Learning Environments for Multi-Access Traffic Management in Network Simulation | Oct 30, 2024 | D4RL, Management | Code Available | 0 |
| The Pump Scheduling Problem: A Real-World Scenario for Reinforcement Learning | Oct 20, 2022 | Deep Reinforcement Learning, Offline RL | Code Available | 0 |
| Semi-Markov Offline Reinforcement Learning for Healthcare | Mar 17, 2022 | Offline RL, reinforcement-learning | Code Available | 0 |
| Semi-Offline Reinforcement Learning for Optimized Text Generation | Jun 16, 2023 | Offline RL, reinforcement-learning | Code Available | 0 |
| Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning | Oct 15, 2024 | D4RL, Model-based Reinforcement Learning | Code Available | 0 |
| DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning under Uncertainty | Jun 14, 2025 | continuous-control, Continuous Control | Code Available | 0 |
| Building Persona Consistent Dialogue Agents with Offline Reinforcement Learning | Oct 16, 2023 | Chatbot, Offline RL | Code Available | 0 |
| VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation | Feb 24, 2023 | Computational Efficiency, Offline RL | Code Available | 0 |
| The Role of Deep Learning Regularizations on Actors in Offline RL | Sep 11, 2024 | D4RL, Offline RL | Code Available | 0 |
| Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions | Nov 1, 2024 | Bayesian Inference, Offline RL | Code Available | 0 |
| Uncertainty-driven Trajectory Truncation for Data Augmentation in Offline Reinforcement Learning | Apr 10, 2023 | D4RL, Data Augmentation | Code Available | 0 |
| Optimality Inductive Biases and Agnostic Guidelines for Offline Reinforcement Learning | Jul 3, 2021 | Attribute, Inductive Bias | Code Available | 0 |
| PyTupli: A Scalable Infrastructure for Collaborative Offline Reinforcement Learning Projects | May 22, 2025 | Offline RL, Reinforcement Learning (RL) | Code Available | 0 |
| Mutual Information Regularized Offline Reinforcement Learning | Oct 14, 2022 | D4RL, Offline RL | Code Available | 0 |
| Think-J: Learning to Think for Generative LLM-as-a-Judge | May 20, 2025 | Offline RL, Reinforcement Learning (RL) | Code Available | 0 |
| Bridging Distributionally Robust Learning and Offline RL: An Approach to Mitigate Distribution Shift and Partial Data Coverage | Oct 27, 2023 | Offline RL, Reinforcement Learning (RL) | Code Available | 0 |