| AirLLM: Diffusion Policy-based Adaptive LoRA for Remote Fine-Tuning of LLM over the Air | Jul 15, 2025 | Denoising, Sequential Decision Making | Unverified | 0 |
| LLM-Stackelberg Games: Conjectural Reasoning Equilibria and Their Applications to Spearphishing | Jul 12, 2025 | Decision Making, Misinformation | Unverified | 0 |
| A Survey of Continual Reinforcement Learning | Jun 27, 2025 | Continual Learning, Decision Making | Unverified | 0 |
| Flow-Based Single-Step Completion for Efficient and Expressive Policy Learning | Jun 26, 2025 | Action Generation, Decision Making | Unverified | 0 |
| POLAR: A Pessimistic Model-based Policy Learning Algorithm for Dynamic Treatment Regimes | Jun 25, 2025 | Sequential Decision Making | Unverified | 0 |
| Efficient Strategy Synthesis for MDPs via Hierarchical Block Decomposition | Jun 21, 2025 | Decision Making, Sequential Decision Making | Unverified | 0 |
| UProp: Investigating the Uncertainty Propagation of LLMs in Multi-Step Agentic Decision-Making | Jun 20, 2025 | Decision Making, Question Answering | Code Available | 0 |
| Multi-Armed Bandits With Machine Learning-Generated Surrogate Rewards | Jun 20, 2025 | Decision Making Under Uncertainty, Multi-Armed Bandits | Unverified | 0 |
| Adaptive Action Duration with Contextual Bandits for Deep Reinforcement Learning in Dynamic Environments | Jun 17, 2025 | Atari Games, Board Games | Code Available | 0 |
| Common Benchmarks Undervalue the Generalization Power of Programmatic Policies | Jun 17, 2025 | Sequential Decision Making | Code Available | 0 |