Only Pay for What Is Uncertain: Variance-Adaptive Thompson Sampling (Mar 16, 2023) · Multi-Armed Bandits, Thompson Sampling
On Minimax Optimal Offline Policy Evaluation (Sep 12, 2014) · Multi-Armed Bandits, Off-policy evaluation
On No-Sensing Adversarial Multi-player Multi-armed Bandits with Collision Communications (Nov 2, 2020) · Multi-Armed Bandits
Towards Tractable Optimism in Model-Based Reinforcement Learning (Jun 21, 2020) · Continuous Control
On Penalization in Stochastic Multi-armed Bandits (Nov 15, 2022) · Fairness, Multi-Armed Bandits
On Private and Robust Bandits (Feb 6, 2023) · Multi-Armed Bandits
On Quantum Natural Policy Gradients (Jan 16, 2024) · Multi-Armed Bandits, Reinforcement Learning
On Regret-optimal Cooperative Nonstochastic Multi-armed Bandits (Nov 30, 2022) · Multi-Armed Bandits
On Regret-Optimal Learning in Decentralized Multi-player Multi-armed Bandits (May 4, 2015) · Multi-Armed Bandits
On Sequential Elimination Algorithms for Best-Arm Identification in Multi-Armed Bandits (Sep 8, 2016) · Multi-Armed Bandits
On Speeding Up Language Model Evaluation (Jul 8, 2024) · Language Model Evaluation, Language Modeling
On Submodular Contextual Bandits (Dec 3, 2021) · Multi-Armed Bandits
On the bias, risk and consistency of sample means in multi-armed bandits (Feb 2, 2019) · Multi-Armed Bandits, Selection bias
On the Complexity of Representation Learning in Contextual Linear Bandits (Dec 19, 2022) · Model Selection, Multi-Armed Bandits
On the Identification and Mitigation of Weaknesses in the Knowledge Gradient Policy for Multi-Armed Bandits (Jul 20, 2016) · Decision Making, Multi-Armed Bandits
On the Importance of Uncertainty in Decision-Making with Large Language Models (Apr 3, 2024) · Decision Making, Multi-Armed Bandits
On the Interplay Between Misspecification and Sub-optimality Gap in Linear Contextual Bandits (Mar 16, 2023) · Multi-Armed Bandits
Achieving the Pareto Frontier of Regret Minimization and Best Arm Identification in Multi-Armed Bandits (Oct 16, 2021) · Multi-Armed Bandits
On the Problem of Best Arm Retention (Apr 16, 2025) · Multi-Armed Bandits
Contextual Decision-Making with Knapsacks Beyond the Worst Case (Nov 25, 2022) · Decision Making, Management
On The Statistical Complexity of Offline Decision-Making (Jan 10, 2025) · Decision Making, Multi-Armed Bandits
On Top-k Selection in Multi-Armed Bandits and Hidden Bipartite Graphs (Dec 1, 2015) · Multi-Armed Bandits
On Universally Optimal Algorithms for A/B Testing (Aug 23, 2023) · Multi-Armed Bandits
Open Problem: Best Arm Identification: Almost Instance-Wise Optimality and the Gap Entropy Conjecture (May 27, 2016) · Multi-Armed Bandits
Open Problem: Model Selection for Contextual Bandits (Jun 19, 2020) · Model Selection
Open Problem: Tight Bounds for Kernelized Multi-Armed Bandits with Bernoulli Rewards (Jul 8, 2024) · Multi-Armed Bandits
Optimal Activation of Halting Multi-Armed Bandit Models (Apr 20, 2023) · Multi-Armed Bandits
Optimal Algorithms for Range Searching over Multi-Armed Bandits (May 4, 2021) · Multi-Armed Bandits
Optimal Algorithms for Stochastic Contextual Preference Bandits (Dec 1, 2021) · Decision Making, Information Retrieval
Optimal Algorithms for Stochastic Multi-Armed Bandits with Heavy Tailed Rewards (Oct 24, 2020) · Multi-Armed Bandits
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits (Dec 4, 2016) · Multi-Armed Bandits, Off-policy evaluation
Optimal Best-Arm Identification under Fixed Confidence with Multiple Optima (May 21, 2025) · Multi-Armed Bandits
Optimal cross-learning for contextual bandits with unknown context distributions (Jan 3, 2024) · Multi-Armed Bandits
Optimal Multitask Linear Regression and Contextual Bandits under Sparse Heterogeneity (Jun 9, 2023) · Multi-Armed Bandits, Regression
Optimal Learning for Sequential Decision Making for Expensive Cost Functions with Stochastic Binary Feedbacks (Sep 13, 2017) · Decision Making, Multi-Armed Bandits
Towards Costless Model Selection in Contextual Bandits: A Bias-Variance Perspective (Jun 11, 2021) · Model Selection, Multi-Armed Bandits
Optimal Multi-Objective Best Arm Identification with Fixed Confidence (Jan 23, 2025) · Multi-Armed Bandits
Optimal No-regret Learning in Repeated First-price Auctions (Mar 22, 2020) · Multi-Armed Bandits, Thompson Sampling
Optimal Rates of (Locally) Differentially Private Heavy-tailed Multi-Armed Bandits (Jun 4, 2021) · Multi-Armed Bandits
Optimal Streaming Algorithms for Multi-Armed Bandits (Oct 23, 2024) · Multi-Armed Bandits
Optimistic Information Directed Sampling (Feb 23, 2024) · Multi-Armed Bandits
Optimism in the Face of Ambiguity Principle for Multi-Armed Bandits (Sep 30, 2024) · Computational Efficiency, Multi-Armed Bandits
Optimizing Online Advertising with Multi-Armed Bandits: Mitigating the Cold Start Problem under Auction Dynamics (Feb 3, 2025) · Multi-Armed Bandits
Optimizing Sharpe Ratio: Risk-Adjusted Decision-Making in Multi-Armed Bandits (May 28, 2024) · Decision Making, Management
Oracle-Efficient Pessimism: Offline Policy Optimization in Contextual Bandits (Jun 13, 2023) · Multi-Armed Bandits
OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits (May 24, 2019) · Multi-Armed Bandits
PAC-Bayesian Analysis of Contextual Bandits (Dec 1, 2011) · Multi-Armed Bandits
PAC-Bayesian Lifelong Learning For Multi-Armed Bandits (Mar 7, 2022) · Lifelong learning, Multi-Armed Bandits
PAC-Bayesian Offline Contextual Bandits With Guarantees (Oct 24, 2022) · Generalization Bounds, Multi-Armed Bandits
PAC Identification of Many Good Arms in Stochastic Multi-Armed Bandits (Jan 24, 2019) · Multi-Armed Bandits