| FOSP: Fine-tuning Offline Safe Policy through World Models | Jul 6, 2024 | Model-based Reinforcement Learning, Offline RL | —Unverified | 0 |
| Comparing Model-free and Model-based Algorithms for Offline Reinforcement Learning | Jan 14, 2022 | model, MuJoCo | —Unverified | 0 |
| From Novelty to Imitation: Self-Distilled Rewards for Offline Reinforcement Learning | Jul 17, 2025 | D4RL, Offline RL | —Unverified | 0 |
| Accelerating Offline Reinforcement Learning Application in Real-Time Bidding and Recommendation: Potential Use of Simulation | Sep 17, 2021 | Decision Making, Offline RL | —Unverified | 0 |
| End-to-end Offline Reinforcement Learning for Glycemia Control | Oct 16, 2023 | Offline RL, reinforcement-learning | —Unverified | 0 |
| End-to-End Offline Goal-Oriented Dialog Policy Learning via Policy Gradient | Dec 7, 2017 | Decoder, Goal-Oriented Dialog | —Unverified | 0 |
| ComaDICE: Offline Cooperative Multi-Agent Reinforcement Learning with Stationary Distribution Shift Regularization | Oct 2, 2024 | MuJoCo, Multi-agent Reinforcement Learning | —Unverified | 0 |
| Enabling A Network AI Gym for Autonomous Cyber Agents | Apr 3, 2023 | Deep Reinforcement Learning, Offline RL | —Unverified | 0 |
| Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL | Apr 15, 2024 | GPU, Offline RL | —Unverified | 0 |
| Augmenting Offline RL with Unlabeled Data | Jun 11, 2024 | Offline RL, Transfer Learning | —Unverified | 0 |
| EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL | Jul 21, 2020 | D4RL, Decision Making | —Unverified | 0 |
| CLUE: Calibrated Latent Guidance for Offline Reinforcement Learning | Jun 23, 2023 | Imitation Learning, Offline RL | —Unverified | 0 |
| Efficient Online RL Fine Tuning with Offline Pre-trained Policy Only | May 22, 2025 | Imitation Learning, Offline RL | —Unverified | 0 |
| A Fast Convergence Theory for Offline Decision Making | Jun 3, 2024 | Decision Making, Offline RL | —Unverified | 0 |
| A Fully Data-Driven Approach for Realistic Traffic Signal Control Using Offline Reinforcement Learning | Nov 27, 2023 | Offline RL, Reinforcement Learning (RL) | —Unverified | 0 |
| InferNet for Delayed Reinforcement Tasks: Addressing the Temporal Credit Assignment Problem | May 2, 2021 | Atari Games, Offline RL | —Unverified | 0 |
| ChiPFormer: Transferable Chip Placement via Offline Decision Transformer | Jun 26, 2023 | Offline RL, Reinforcement Learning (RL) | —Unverified | 0 |
| Efficient Imitation Learning with Conservative World Models | May 21, 2024 | Imitation Learning, Offline RL | —Unverified | 0 |
| Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings | May 13, 2021 | Offline RL | —Unverified | 0 |
| Adversarially Trained Weighted Actor-Critic for Safe Offline Reinforcement Learning | Jan 1, 2024 | continuous-control, Continuous Control | —Unverified | 0 |
| Dual Generator Offline Reinforcement Learning | Nov 2, 2022 | Offline RL, reinforcement-learning | —Unverified | 0 |
| A Survey on Model-based Reinforcement Learning | Jun 19, 2022 | Decision Making, model | —Unverified | 0 |
| Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning | Oct 18, 2023 | Offline RL, Quantization | —Unverified | 0 |
| Improving Zero-shot Generalization in Offline Reinforcement Learning using Generalized Similarity Functions | Nov 29, 2021 | Contrastive Learning, Decision Making | —Unverified | 0 |
| DRDT3: Diffusion-Refined Decision Test-Time Training Model | Jan 12, 2025 | D4RL, Offline RL | —Unverified | 0 |