| Building Persona Consistent Dialogue Agents with Offline Reinforcement Learning | Oct 16, 2023 | Chatbot, Offline RL | Code Available | 0 |
| VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation | Feb 24, 2023 | Computational Efficiency, Offline RL | Code Available | 0 |
| The Role of Deep Learning Regularizations on Actors in Offline RL | Sep 11, 2024 | D4RL, Offline RL | Code Available | 0 |
| Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions | Nov 1, 2024 | Bayesian Inference, Offline RL | Code Available | 0 |
| Uncertainty-driven Trajectory Truncation for Data Augmentation in Offline Reinforcement Learning | Apr 10, 2023 | D4RL, Data Augmentation | Code Available | 0 |
| Optimality Inductive Biases and Agnostic Guidelines for Offline Reinforcement Learning | Jul 3, 2021 | Attribute, Inductive Bias | Code Available | 0 |
| PyTupli: A Scalable Infrastructure for Collaborative Offline Reinforcement Learning Projects | May 22, 2025 | Offline RL, Reinforcement Learning (RL) | Code Available | 0 |
| Mutual Information Regularized Offline Reinforcement Learning | Oct 14, 2022 | D4RL, Offline RL | Code Available | 0 |
| Think-J: Learning to Think for Generative LLM-as-a-Judge | May 20, 2025 | Offline RL, Reinforcement Learning (RL) | Code Available | 0 |
| Bridging Distributionally Robust Learning and Offline RL: An Approach to Mitigate Distribution Shift and Partial Data Coverage | Oct 27, 2023 | Offline RL, Reinforcement Learning (RL) | Code Available | 0 |