| LLM-Stackelberg Games: Conjectural Reasoning Equilibria and Their Applications to Spearphishing | Jul 12, 2025 | Decision Making, Misinformation | Unverified | 0 |
| A Survey of Continual Reinforcement Learning | Jun 27, 2025 | Continual Learning, Decision Making | Unverified | 0 |
| Flow-Based Single-Step Completion for Efficient and Expressive Policy Learning | Jun 26, 2025 | Action Generation, Decision Making | Unverified | 0 |
| POLAR: A Pessimistic Model-based Policy Learning Algorithm for Dynamic Treatment Regimes | Jun 25, 2025 | Sequential Decision Making | Unverified | 0 |
| Efficient Strategy Synthesis for MDPs via Hierarchical Block Decomposition | Jun 21, 2025 | Decision Making, Sequential Decision Making | Unverified | 0 |
| Multi-Armed Bandits With Machine Learning-Generated Surrogate Rewards | Jun 20, 2025 | Decision Making Under Uncertainty, Multi-Armed Bandits | Unverified | 0 |
| UProp: Investigating the Uncertainty Propagation of LLMs in Multi-Step Agentic Decision-Making | Jun 20, 2025 | Decision Making, Question Answering | Code Available | 0 |
| Common Benchmarks Undervalue the Generalization Power of Programmatic Policies | Jun 17, 2025 | Sequential Decision Making | Code Available | 0 |
| Adaptive Action Duration with Contextual Bandits for Deep Reinforcement Learning in Dynamic Environments | Jun 17, 2025 | Atari Games, Board Games | Code Available | 0 |
| Leveraging In-Context Learning for Language Model Agents | Jun 16, 2025 | In-Context Learning, Language Modeling | Unverified | 0 |
| Revisiting Clustering of Neural Bandits: Selective Reinitialization for Mitigating Loss of Plasticity | Jun 14, 2025 | Change Detection, Clustering | Unverified | 0 |
| TooBadRL: Trigger Optimization to Boost Effectiveness of Backdoor Attacks on Deep Reinforcement Learning | Jun 11, 2025 | Deep Reinforcement Learning, Sequential Decision Making | Code Available | 0 |
| Towards Responsible AI: Advances in Safety, Fairness, and Accountability of Autonomous Systems | Jun 11, 2025 | Autonomous Vehicles, Decision Making | Unverified | 0 |
| How to Provably Improve Return Conditioned Supervised Learning? | Jun 10, 2025 | Decision Making, Offline RL | Unverified | 0 |
| QForce-RL: Quantized FPGA-Optimized Reinforcement Learning Compute Engine | Jun 8, 2025 | Decision Making, Quantization | Unverified | 0 |
| Contextual Experience Replay for Self-Improvement of Language Agents | Jun 7, 2025 | Decision Making, Large Language Model | Unverified | 0 |
| AutoQD: Automatic Discovery of Diverse Behaviors with Quality-Diversity Optimization | Jun 5, 2025 | continuous-control, Continuous Control | Unverified | 0 |
| TextAtari: 100K Frames Game Playing with Language Agents | Jun 4, 2025 | Atari Games, Decision Making | Code Available | 0 |
| Emergent Risk Awareness in Rational Agents under Resource Constraints | May 29, 2025 | Sequential Decision Making | Unverified | 0 |
| Active Layer-Contrastive Decoding Reduces Hallucination in Large Language Model Generation | May 29, 2025 | Decision Making, Hallucination | Unverified | 0 |
| Adaptive Frontier Exploration on Graphs with Applications to Network-Based Disease Testing | May 27, 2025 | Sequential Decision Making | Unverified | 0 |
| Variational Deep Learning via Implicit Regularization | May 26, 2025 | Deep Learning, Inductive Bias | Unverified | 0 |
| DDO: Dual-Decision Optimization via Multi-Agent Collaboration for LLM-Based Medical Consultation | May 24, 2025 | Decision Making, Sequential Decision Making | Unverified | 0 |
| Automata Learning of Preferences over Temporal Logic Formulas from Pairwise Comparisons | May 23, 2025 | Motion Planning, Sequential Decision Making | Unverified | 0 |
| Reward Is Enough: LLMs Are In-Context Reinforcement Learners | May 21, 2025 | Large Language Model, Reinforcement Learning (RL) | Unverified | 0 |