| You Can't Count on Luck: Why Decision Transformers and RvS Fail in Stochastic Environments | May 31, 2022 | Offline RL, Playing the Game of 2048 | —Unverified | 0 | 0 |
| You Only Evaluate Once: a Simple Baseline Algorithm for Offline RL | Oct 5, 2021 | D4RL, Offline RL | —Unverified | 0 | 0 |
| Your Offline Policy is Not Trustworthy: Bilevel Reinforcement Learning for Sequential Portfolio Optimization | May 19, 2025 | Offline RL, Portfolio Optimization | —Unverified | 0 | 0 |
| PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation | Jun 6, 2023 | Offline RL, Reinforcement Learning (RL) | —Unverified | 0 | 0 |
| Prior-Guided Diffusion Planning for Offline Reinforcement Learning | May 16, 2025 | Decision Making, Denoising | —Unverified | 0 | 0 |
| How to Provably Improve Return Conditioned Supervised Learning? | Jun 10, 2025 | Decision Making, Offline RL | —Unverified | 0 | 0 |
| Accelerating Diffusion Models in Offline RL via Reward-Aware Consistency Trajectory Distillation | Jun 9, 2025 | Decision Making, MuJoCo | —Unverified | 0 | 0 |
| Accelerating Offline Reinforcement Learning Application in Real-Time Bidding and Recommendation: Potential Use of Simulation | Sep 17, 2021 | Decision Making, Offline RL | —Unverified | 0 | 0 |
| Achieving Fairness in Multi-Agent Markov Decision Processes Using Reinforcement Learning | Jun 1, 2023 | Fairness, Offline RL | —Unverified | 0 | 0 |
| A Conservative Q-Learning approach for handling distribution shift in sepsis treatment strategies | Mar 25, 2022 | Deep Reinforcement Learning, Offline RL | —Unverified | 0 | 0 |
| Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning | Oct 18, 2023 | Offline RL, Quantization | —Unverified | 0 | 0 |
| AdaCred: Adaptive Causal Decision Transformers with Feature Crediting | Dec 19, 2024 | Attribute, Imitation Learning | —Unverified | 0 | 0 |
| Adaptive Policy Learning for Offline-to-Online Reinforcement Learning | Mar 14, 2023 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| Adaptive Q-learning for Interaction-Limited Reinforcement Learning | Sep 29, 2021 | Offline RL, Q-Learning | —Unverified | 0 | 0 |
| Addressing Distribution Shift in Online Reinforcement Learning with Offline Datasets | Jan 1, 2021 | D4RL, MuJoCo | —Unverified | 0 | 0 |
| Addressing Extrapolation Error in Deep Offline Reinforcement Learning | Jan 1, 2021 | Offline RL, reinforcement-learning | —Unverified | 0 | 0 |
| ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning | May 29, 2025 | Denoising, MuJoCo | —Unverified | 0 | 0 |
| A Dual Approach to Imitation Learning from Observations with Offline Datasets | Jun 13, 2024 | Imitation Learning, Offline RL | —Unverified | 0 | 0 |
| Advancing RAN Slicing with Offline Reinforcement Learning | Dec 16, 2023 | Management, Offline RL | —Unverified | 0 | 0 |
| Adversarially Trained Weighted Actor-Critic for Safe Offline Reinforcement Learning | Jan 1, 2024 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| A Fully Data-Driven Approach for Realistic Traffic Signal Control Using Offline Reinforcement Learning | Nov 27, 2023 | Offline RL, Reinforcement Learning (RL) | —Unverified | 0 | 0 |
| Align Your Intents: Offline Imitation Learning via Optimal Transport | Feb 20, 2024 | D4RL, Decision Making | —Unverified | 0 | 0 |
| Task-Agnostic Learning to Accomplish New Tasks | Sep 9, 2022 | Imitation Learning, Offline RL | —Unverified | 0 | 0 |
| Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning | May 3, 2025 | D4RL, Offline RL | —Unverified | 0 | 0 |
| An Empirical Study of Implicit Regularization in Deep Offline RL | Jul 5, 2022 | Offline RL | —Unverified | 0 | 0 |
| An Offline Reinforcement Learning Algorithm Customized for Multi-Task Fusion in Large-Scale Recommender Systems | Apr 19, 2024 | Efficient Exploration, Multi-Task Learning | —Unverified | 0 | 0 |
| A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning | Dec 12, 2023 | MuJoCo, Offline RL | —Unverified | 0 | 0 |
| A Primal-Dual Algorithm for Offline Constrained Reinforcement Learning with Linear MDPs | Feb 7, 2024 | Offline RL, Reinforcement Learning (RL) | —Unverified | 0 | 0 |
| ARMOR: A Model-based Framework for Improving Arbitrary Baseline Policies with Offline Data | Nov 8, 2022 | Offline RL | —Unverified | 0 | 0 |
| A Simple Unified Uncertainty-Guided Framework for Offline-to-Online Reinforcement Learning | Jun 13, 2023 | D4RL, Efficient Exploration | —Unverified | 0 | 0 |
| A Strong Baseline for Batch Imitation Learning | Feb 6, 2023 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| A Survey of Zero-shot Generalisation in Deep Reinforcement Learning | Nov 18, 2021 | Deep Reinforcement Learning, Offline RL | —Unverified | 0 | 0 |
| A Survey on Model-based Reinforcement Learning | Jun 19, 2022 | Decision Making, model | —Unverified | 0 | 0 |
| A Fast Convergence Theory for Offline Decision Making | Jun 3, 2024 | Decision Making, Offline RL | —Unverified | 0 | 0 |
| Augmenting Offline RL with Unlabeled Data | Jun 11, 2024 | Offline RL, Transfer Learning | —Unverified | 0 | 0 |
| Automatic Trade-off Adaptation in Offline RL | Jun 16, 2023 | Offline RL | —Unverified | 0 | 0 |
| A Validation Tool for Designing Reinforcement Learning Environments | Dec 10, 2021 | Offline RL, reinforcement-learning | —Unverified | 0 | 0 |
| Batch-Constrained Distributional Reinforcement Learning for Session-based Recommendation | Dec 16, 2020 | Deep Reinforcement Learning, Distributional Reinforcement Learning | —Unverified | 0 | 0 |
| Bayesian Reparameterization of Reward-Conditioned Reinforcement Learning with Energy-based Models | May 18, 2023 | MuJoCo, Offline RL | —Unverified | 0 | 0 |
| BCRLSP: An Offline Reinforcement Learning Framework for Sequential Targeted Promotion | Jul 16, 2022 | Offline RL, reinforcement-learning | —Unverified | 0 | 0 |
| BECAUSE: Bilinear Causal Representation for Generalizable Offline Model-based Reinforcement Learning | Jul 15, 2024 | Model-based Reinforcement Learning, Offline RL | —Unverified | 0 | 0 |
| Behavioral Entropy-Guided Dataset Generation for Offline Reinforcement Learning | Feb 6, 2025 | Dataset Generation, MuJoCo | —Unverified | 0 | 0 |
| Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL | Jun 16, 2021 | D4RL, Domain Generalization | —Unverified | 0 | 0 |
| Behavior Regularized Offline Reinforcement Learning | Nov 26, 2019 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| Behaviour Discovery and Attribution for Explainable Reinforcement Learning | Mar 19, 2025 | Offline RL, reinforcement-learning | —Unverified | 0 | 0 |
| Bellman Residual Orthogonalization for Offline Reinforcement Learning | Mar 24, 2022 | Offline RL, Off-policy evaluation | —Unverified | 0 | 0 |
| Benchmarking Offline Reinforcement Learning Algorithms for E-Commerce Order Fraud Evaluation | Dec 5, 2022 | Benchmarking, Binary Classification | —Unverified | 0 | 0 |
| Benchmarks and Algorithms for Offline Preference-Based Reward Learning | Jan 3, 2023 | Active Learning, Offline RL | —Unverified | 0 | 0 |
| Benchmarks for Reinforcement Learning with Biased Offline Data and Imperfect Simulators | Jun 30, 2024 | Autonomous Vehicles, Offline RL | —Unverified | 0 | 0 |
| Bi-Level Offline Policy Optimization with Limited Exploration | Oct 10, 2023 | Offline RL, Reinforcement Learning (RL) | —Unverified | 0 | 0 |