Oracle-Efficient Pessimism: Offline Policy Optimization in Contextual Bandits (Jun 13, 2023) [Multi-Armed Bandits]
OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits (May 24, 2019) [Multi-Armed Bandits]
PAC-Bayesian Analysis of Contextual Bandits (Dec 1, 2011) [Multi-Armed Bandits]
PAC-Bayesian Lifelong Learning For Multi-Armed Bandits (Mar 7, 2022) [Lifelong Learning, Multi-Armed Bandits]
PAC-Bayesian Offline Contextual Bandits With Guarantees (Oct 24, 2022) [Generalization Bounds, Multi-Armed Bandits]
PAC Identification of Many Good Arms in Stochastic Multi-Armed Bandits (Jan 24, 2019) [Multi-Armed Bandits]
PAC Reinforcement Learning with Rich Observations (Feb 8, 2016) [Decision Making, Multi-Armed Bandits]
Pairwise Elimination with Instance-Dependent Guarantees for Bandits with Cost Subsidy (Jan 17, 2025) [Multi-Armed Bandits]
Parallel Contextual Bandits in Wireless Handover Optimization (Jan 21, 2019) [Multi-Armed Bandits, Thompson Sampling]
Parallelizing Contextual Bandits (May 21, 2021) [Decision Making, Decision Making Under Uncertainty]
Parameterized Exploration (Jul 13, 2019) [Multi-Armed Bandits]
Partial Bandit and Semi-Bandit: Making the Most Out of Scarce Users' Feedback (Sep 16, 2020) [Multi-Armed Bandits, Recommendation Systems]
Partially Observable Contextual Bandits with Linear Payoffs (Sep 17, 2024) [Decision Making, Multi-Armed Bandits]
Personalization Paradox in Behavior Change Apps: Lessons from a Social Comparison-Based Personalized App for Physical Activity (Jan 25, 2021) [Multi-Armed Bandits]
Personalized Course Sequence Recommendations (Dec 30, 2015) [Multi-Armed Bandits]
Perturbed-History Exploration in Stochastic Multi-Armed Bandits (Feb 26, 2019) [Multi-Armed Bandits]
Pessimism for Offline Linear Contextual Bandits using ℓ_p Confidence Sets (May 21, 2022) [Multi-Armed Bandits]
PG-TS: Improved Thompson Sampling for Logistic Contextual Bandits (May 18, 2018) [Multi-Armed Bandits, Thompson Sampling]
Phasic Diversity Optimization for Population-Based Reinforcement Learning (Mar 17, 2024) [Diversity, MuJoCo]
Non-Stationary Off-Policy Optimization (Jun 15, 2020) [Multi-Armed Bandits]
Player Modeling via Multi-Armed Bandits (Feb 10, 2021) [Multi-Armed Bandits]
Policy Gradients for Contextual Recommendations (Feb 12, 2018) [Decision Making, Multi-Armed Bandits]
Practical Algorithms for Best-K Identification in Multi-Armed Bandits (May 19, 2017) [Multi-Armed Bandits]
Practical Contextual Bandits with Regression Oracles (Mar 3, 2018) [General Classification, Multi-Armed Bandits]
Preference-based Online Learning with Dueling Bandits: A Survey (Jul 30, 2018) [Multi-Armed Bandits, Survey]
Preference-centric Bandits: Optimality of Mixtures and Regret-efficient Algorithms (Apr 29, 2025) [Multi-Armed Bandits, Navigate]
Privacy Amplification via Shuffling for Linear Contextual Bandits (Dec 11, 2021) [Multi-Armed Bandits]
Privacy-Preserving Communication-Efficient Federated Multi-Armed Bandits (Nov 2, 2021) [Decision Making, Multi-Armed Bandits]
Privacy-Preserving Multi-Party Contextual Bandits (Oct 11, 2019) [Multi-Armed Bandits, Privacy Preserving]
Problem Dependent Reinforcement Learning Bounds Which Can Identify Bandit Structure in MDPs (Nov 3, 2019) [Multi-Armed Bandits, Reinforcement Learning]
Productization Challenges of Contextual Multi-Armed Bandits (Jul 10, 2019) [Multi-Armed Bandits]
Proportional Response: Contextual Bandits for Simple and Cumulative Regret Minimization (Jul 5, 2023) [Multi-Armed Bandits]
Provable Benefits of Policy Learning from Human Preferences in Contextual Bandit Problems (Jul 24, 2023) [Decision Making, Multi-Armed Bandits]
Provable General Function Class Representation Learning in Multitask Bandits and MDPs (May 31, 2022) [Multi-Armed Bandits, Reinforcement Learning]
Provably and Practically Efficient Neural Contextual Bandits (May 31, 2022) [Multi-Armed Bandits]
Provably Efficient High-Dimensional Bandit Learning with Batched Feedbacks (Nov 22, 2023) [Multi-Armed Bandits]
Transfer Learning with Partially Observable Offline Data via Causal Bounds (Aug 7, 2023) [Multi-Armed Bandits, Transfer Learning]
Provably Efficient Reinforcement Learning for Adversarial Restless Multi-Armed Bandits with Unknown Transitions and Bandit Feedback (May 2, 2024) [Multi-Armed Bandits, Sequential Decision Making]
Provably Efficient RLHF Pipeline: A Unified View from Contextual Bandits (Feb 11, 2025) [Computational Efficiency, Multi-Armed Bandits]
Provably Optimal Algorithms for Generalized Linear Contextual Bandits (Feb 28, 2017) [Multi-Armed Bandits, News Recommendation]
Pure Exploration in Asynchronous Federated Bandits (Oct 17, 2023) [Multi-Armed Bandits]
Pure exploration in multi-armed bandits with low rank structure using oblivious sampler (Jun 28, 2023) [Multi-Armed Bandits]
Combinatorial Pure Exploration of Causal Bandits (Jun 16, 2022) [Causal Inference, Multi-Armed Bandits]
Pure Exploration under Mediators' Feedback (Aug 29, 2023) [Decision Making, Multi-Armed Bandits]
QoS-Aware Multi-Armed Bandits (Feb 28, 2017) [Decision Making, Multi-Armed Bandits]
Quantile Multi-Armed Bandits with 1-bit Feedback (Feb 10, 2025) [Multi-Armed Bandits]
Quantum contextual bandits and recommender systems for quantum data (Jan 31, 2023) [Multi-Armed Bandits, Recommendation Systems]
Quantum Heavy-tailed Bandits (Jan 23, 2023) [Multi-Armed Bandits]
Quantum Multi-Armed Bandits and Stochastic Linear Bandits Enjoy Logarithmic Regrets (May 30, 2022) [Multi-Armed Bandits, Reinforcement Learning]
Query-Efficient Correlation Clustering with Noisy Oracle (Feb 2, 2024) [Clustering, Multi-Armed Bandits]