Federated Linear Contextual Bandits with User-level Differential Privacy | Jun 8, 2023 | Decision Making, Multi-Armed Bandits | Unverified
Tight Regret Bounds for Single-pass Streaming Multi-armed Bandits | Jun 3, 2023 | Multi-Armed Bandits, Open-Ended Question Answering | Code Available
Differentially Private Episodic Reinforcement Learning with Heavy-tailed Rewards | Jun 1, 2023 | Multi-Armed Bandits, reinforcement-learning | Unverified
Representation-Driven Reinforcement Learning | May 31, 2023 | Multi-Armed Bandits, reinforcement-learning | Unverified
Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits | May 30, 2023 | Multi-Armed Bandits | Unverified
Contextual Bandits with Budgeted Information Reveal | May 29, 2023 | Multi-Armed Bandits | Unverified
Small Total-Cost Constraints in Contextual Bandits with Knapsacks, with Application to Fairness | May 25, 2023 | Fairness, Multi-Armed Bandits | Unverified
Meta-in-context learning in large language models | May 22, 2023 | In-Context Learning, Multi-Armed Bandits | Code Available
Sequential Best-Arm Identification with Application to Brain-Computer Interface | May 17, 2023 | Brain Computer Interface, EEG | Unverified
Efficient Training of Multi-task Combinatorial Neural Solver with Multi-armed Bandits | May 10, 2023 | Combinatorial Optimization, Decoder | Unverified
Reward Teaching for Federated Multi-armed Bandits | May 3, 2023 | Multi-Armed Bandits | Unverified
Stochastic Contextual Bandits with Graph-based Contexts | May 2, 2023 | Multi-Armed Bandits | Unverified
First- and Second-Order Bounds for Adversarial Linear Contextual Bandits | May 1, 2023 | Multi-Armed Bandits | Unverified
Kullback-Leibler Maillard Sampling for Multi-armed Bandits with Bounded Rewards | Apr 28, 2023 | Multi-Armed Bandits, Thompson Sampling | Code Available
Quantum Natural Policy Gradients: Towards Sample-Efficient Reinforcement Learning | Apr 26, 2023 | Multi-Armed Bandits, reinforcement-learning | Code Available
Thompson Sampling Regret Bounds for Contextual Bandits with sub-Gaussian rewards | Apr 26, 2023 | Multi-Armed Bandits, Thompson Sampling | Unverified
Optimal Activation of Halting Multi-Armed Bandit Models | Apr 20, 2023 | Multi-Armed Bandits | Unverified
A Field Test of Bandit Algorithms for Recommendations: Understanding the Validity of Assumptions on Human Preferences in Multi-armed Bandits | Apr 16, 2023 | Multi-Armed Bandits, Recommendation Systems | Code Available
Learning Personalized Decision Support Policies | Apr 13, 2023 | Language Modelling, Large Language Model | Unverified
SmartChoices: Augmenting Software with Learned Implementations | Apr 12, 2023 | Multi-Armed Bandits, Philosophy | Unverified
BanditQ: Fair Bandits with Guaranteed Rewards | Apr 11, 2023 | Multi-Armed Bandits | Unverified
Full Gradient Deep Reinforcement Learning for Average-Reward Criterion | Apr 7, 2023 | Deep Reinforcement Learning, Multi-Armed Bandits | Unverified
Sharp Deviations Bounds for Dirichlet Weighted Sums with Application to analysis of Bayesian algorithms | Apr 6, 2023 | Multi-Armed Bandits, Thompson Sampling | Unverified
Federated Learning for Heterogeneous Bandits with Unobserved Contexts | Mar 29, 2023 | Federated Learning, Multi-Armed Bandits | Unverified
Adaptive Endpointing with Deep Contextual Multi-armed Bandits | Mar 23, 2023 | Multi-Armed Bandits | Unverified
An Empirical Evaluation of Federated Contextual Bandit Algorithms | Mar 17, 2023 | Federated Learning, Multi-Armed Bandits | Unverified
On the Interplay Between Misspecification and Sub-optimality Gap in Linear Contextual Bandits | Mar 16, 2023 | Multi-Armed Bandits | Unverified
Only Pay for What Is Uncertain: Variance-Adaptive Thompson Sampling | Mar 16, 2023 | Multi-Armed Bandits, Thompson Sampling | Unverified
Data Dependent Regret Guarantees Against General Comparators for Full or Bandit Feedback | Mar 12, 2023 | Multi-Armed Bandits | Unverified
Flooding with Absorption: An Efficient Protocol for Heterogeneous Bandits over Complex Networks | Mar 9, 2023 | Decision Making, Multi-Armed Bandits | Code Available
Queue Scheduling with Adversarial Bandit Learning | Mar 3, 2023 | Multi-Armed Bandits, Scheduling | Unverified
Efficient Explorative Key-term Selection Strategies for Conversational Contextual Bandits | Mar 1, 2023 | Computational Efficiency, Multi-Armed Bandits | Code Available
Fairness for Workers Who Pull the Arms: An Index Based Policy for Allocation of Restless Bandit Tasks | Mar 1, 2023 | Fairness, Multi-Armed Bandits | Unverified
Multi-Armed Bandits with Generalized Temporally-Partitioned Rewards | Mar 1, 2023 | Decision Making, Multi-Armed Bandits | Unverified
Approximately Stationary Bandits with Knapsacks | Feb 28, 2023 | Multi-Armed Bandits | Unverified
The Choice of Noninformative Priors for Thompson Sampling in Multiparameter Bandit Models | Feb 28, 2023 | Multi-Armed Bandits, Thompson Sampling | Unverified
Improved Best-of-Both-Worlds Guarantees for Multi-Armed Bandits: FTRL with General Regularizers and Multiple Optimal Arms | Feb 27, 2023 | Multi-Armed Bandits | Unverified
On Differentially Private Federated Linear Contextual Bandits | Feb 27, 2023 | Multi-Armed Bandits | Unverified
Kernel Conditional Moment Constraints for Confounding Robust Inference | Feb 26, 2023 | Multi-Armed Bandits, Sensitivity | Code Available
Active Velocity Estimation using Light Curtains via Self-Supervised Multi-Armed Bandits | Feb 24, 2023 | Multi-Armed Bandits, Navigate | Unverified
Asymptotically Unbiased Off-Policy Policy Evaluation when Reusing Old Data in Nonstationary Environments | Feb 23, 2023 | Multi-Armed Bandits, regression | Unverified
Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency | Feb 21, 2023 | Computational Efficiency, Decision Making | Unverified
A Blackbox Approach to Best of Both Worlds in Bandits and Beyond | Feb 20, 2023 | Multi-Armed Bandits | Unverified
Estimating Optimal Policy Value in General Linear Contextual Bandits | Feb 19, 2023 | Model Selection, Multi-Armed Bandits | Unverified
Online Continuous Hyperparameter Optimization for Generalized Linear Contextual Bandits | Feb 18, 2023 | Hyperparameter Optimization, Multi-Armed Bandits | Unverified
Improving Fairness in Adaptive Social Exergames via Shapley Bandits | Feb 18, 2023 | Fairness, Multi-Armed Bandits | Unverified
Stochastic Approximation Approaches to Group Distributionally Robust Optimization and Beyond | Feb 18, 2023 | Multi-Armed Bandits | Unverified
Practical Contextual Bandits with Feedback Graphs | Feb 17, 2023 | Multi-Armed Bandits, regression | Unverified
Infinite Action Contextual Bandits with Reusable Data Exhaust | Feb 16, 2023 | Model Selection, Multi-Armed Bandits | Code Available
Genetic multi-armed bandits: a reinforcement learning approach for discrete optimization via simulation | Feb 15, 2023 | Multi-Armed Bandits, Stochastic Optimization | Unverified