| Bayesian learning of the optimal action-value function in a Markov decision process | May 3, 2025 | Decision MakingSequential Decision Making | —Unverified | 0 | 0 |
| Doubly Robust Off-policy Value Evaluation for Reinforcement Learning | Nov 11, 2015 | Decision Makingreinforcement-learning | —Unverified | 0 | 0 |
| Bayesian Inverse Transition Learning for Offline Settings | Aug 9, 2023 | Decision MakingSequential Decision Making | —Unverified | 0 | 0 |
| Analysis of Thompson Sampling for Partially Observable Contextual Multi-Armed Bandits | Oct 23, 2021 | Decision MakingMulti-Armed Bandits | —Unverified | 0 | 0 |
| Adaptive Learning Rate for Follow-the-Regularized-Leader: Competitive Analysis and Best-of-Both-Worlds | Mar 1, 2024 | Decision MakingMulti-Armed Bandits | —Unverified | 0 | 0 |
| A Contextual Bandit Approach for Stream-Based Active Learning | Jan 24, 2017 | Active LearningDecision Making | —Unverified | 0 | 0 |
| DOPL: Direct Online Preference Learning for Restless Bandits with Preference Feedback | Oct 7, 2024 | Multi-Armed BanditsSequential Decision Making | —Unverified | 0 | 0 |
| Don't Watch Me: A Spatio-Temporal Trojan Attack on Deep-Reinforcement-Learning-Augment Autonomous Driving | Nov 22, 2022 | Autonomous DrivingDecision Making | —Unverified | 0 | 0 |
| Bayesian Graph Traversal | Mar 7, 2025 | Decision MakingSequential Decision Making | —Unverified | 0 | 0 |
| Do LLMs Play Dice? Exploring Probability Distribution Sampling in Large Language Models for Behavioral Simulation | Apr 13, 2024 | Decision MakingSequential Decision Making | —Unverified | 0 | 0 |