| From Novelty to Imitation: Self-Distilled Rewards for Offline Reinforcement Learning | Jul 17, 2025 | D4RL, Offline RL | —Unverified | 0 |
| Step-wise Policy for Rare-tool Knowledge (SPaRK): Offline RL that Drives Diverse Tool Use in LLMs | Jul 15, 2025 | Diversity, MMLU | Code Available | 0 |
| Robust Bandwidth Estimation for Real-Time Communication with Offline Reinforcement Learning | Jul 8, 2025 | Offline RL, Reinforcement Learning (RL) | —Unverified | 0 |
| Flow-Based Single-Step Completion for Efficient and Expressive Policy Learning | Jun 26, 2025 | Action Generation, Decision Making | —Unverified | 0 |
| Optimal Single-Policy Sample Complexity and Transient Coverage for Average-Reward Offline RL | Jun 26, 2025 | Offline RL | —Unverified | 0 |
| Sparse-Reg: Improving Sample Complexity in Offline Reinforcement Learning using Sparsity | Jun 20, 2025 | Continuous Control | Code Available | 0 |
| CAWR: Corruption-Averse Advantage-Weighted Regression for Robust Policy Optimization | Jun 18, 2025 | D4RL, Offline RL | Code Available | 0 |
| IntelliLung: Advancing Safe Mechanical Ventilation using Offline RL with Hybrid Actions and Clinically Aligned Rewards | Jun 17, 2025 | Offline RL, Reinforcement Learning (RL) | —Unverified | 0 |
| Toward Explainable Offline RL: Analyzing Representations in Intrinsically Motivated Decision Transformers | Jun 16, 2025 | Decision Making, Decision Making Under Uncertainty | —Unverified | 0 |
| DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning under Uncertainty | Jun 14, 2025 | Continuous Control | Code Available | 0 |