Provable Benefits of Policy Learning from Human Preferences in Contextual Bandit Problems | Jul 24, 2023 | Decision Making, Multi-Armed Bandits | Unverified 0
Contextual Bandits and Imitation Learning via Preference-Based Active Queries | Jul 24, 2023 | Imitation Learning, Multi-Armed Bandits | Unverified 0
Preferences Evolve And So Should Your Bandits: Bandits with Evolving States for Online Platforms | Jul 21, 2023 | Multi-Armed Bandits, Recommendation Systems | Unverified 0
Decentralized Smart Charging of Large-Scale EVs using Adaptive Multi-Agent Multi-Armed Bandits | Jul 20, 2023 | Fairness, Multi-Armed Bandits | Unverified 0
VITS : Variational Inference Thompson Sampling for contextual bandits | Jul 19, 2023 | Multi-Armed Bandits, Thompson Sampling | Code Available 0
Adaptive Linear Estimating Equations | Jul 14, 2023 | Multi-Armed Bandits | Code Available 0
On Interpolating Experts and Multi-Armed Bandits | Jul 14, 2023 | Multi-Armed Bandits | Unverified 0
Tracking Most Significant Shifts in Nonparametric Contextual Bandits | Jul 11, 2023 | Multi-Armed Bandits | Unverified 0
SHAP@k:Efficient and Probably Approximately Correct (PAC) Identification of Top-k Features | Jul 10, 2023 | Feature Importance, Multi-Armed Bandits | Unverified 0
BOF-UCB: A Bayesian-Optimistic Frequentist Algorithm for Non-Stationary Contextual Bandits | Jul 7, 2023 | Decision Making, Multi-Armed Bandits | Unverified 0
Proportional Response: Contextual Bandits for Simple and Cumulative Regret Minimization | Jul 5, 2023 | Multi-Armed Bandits | Unverified 0
Meta-Learning Adversarial Bandit Algorithms | Jul 5, 2023 | Meta-Learning, Multi-Armed Bandits | Unverified 0
Thompson sampling for improved exploration in GFlowNets | Jun 30, 2023 | Active Learning, Decision Making | Unverified 0
Kernel ε-Greedy for Multi-Armed Bandits with Covariates | Jun 29, 2023 | Multi-Armed Bandits | Unverified 0
Pure exploration in multi-armed bandits with low rank structure using oblivious sampler | Jun 28, 2023 | Multi-Armed Bandits | Unverified 0
You Can Trade Your Experience in Distributed Multi-Agent Multi-Armed Bandits | Jun 19, 2023 | Decision Making, Multi-Armed Bandits | Unverified 0
Langevin Thompson Sampling with Logarithmic Communication: Bandits and Reinforcement Learning | Jun 15, 2023 | Decision Making, Multi-Armed Bandits | Unverified 0
Multi-Fidelity Multi-Armed Bandits Revisited | Jun 13, 2023 | Multi-Armed Bandits | Unverified 0
Oracle-Efficient Pessimism: Offline Policy Optimization in Contextual Bandits | Jun 13, 2023 | Multi-Armed Bandits | Unverified 0
Budgeted Multi-Armed Bandits with Asymmetric Confidence Intervals | Jun 12, 2023 | Multi-Armed Bandits | Code Available 0
Optimal Multitask Linear Regression and Contextual Bandits under Sparse Heterogeneity | Jun 9, 2023 | Multi-Armed Bandits, regression | Unverified 0
Federated Linear Contextual Bandits with User-level Differential Privacy | Jun 8, 2023 | Decision Making, Multi-Armed Bandits | Unverified 0
Tight Regret Bounds for Single-pass Streaming Multi-armed Bandits | Jun 3, 2023 | Multi-Armed Bandits, Open-Ended Question Answering | Code Available 0
Differentially Private Episodic Reinforcement Learning with Heavy-tailed Rewards | Jun 1, 2023 | Multi-Armed Bandits, reinforcement-learning | Unverified 0
Representation-Driven Reinforcement Learning | May 31, 2023 | Multi-Armed Bandits, reinforcement-learning | Unverified 0
Competing for Shareable Arms in Multi-Player Multi-Armed Bandits | May 30, 2023 | Multi-Armed Bandits | Code Available 1
Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits | May 30, 2023 | Multi-Armed Bandits | Unverified 0
Contextual Bandits with Budgeted Information Reveal | May 29, 2023 | Multi-Armed Bandits | Unverified 0
Small Total-Cost Constraints in Contextual Bandits with Knapsacks, with Application to Fairness | May 25, 2023 | Fairness, Multi-Armed Bandits | Unverified 0
Meta-in-context learning in large language models | May 22, 2023 | In-Context Learning, Multi-Armed Bandits | Code Available 0
Sequential Best-Arm Identification with Application to Brain-Computer Interface | May 17, 2023 | Brain Computer Interface, EEG | Unverified 0
Implicitly normalized forecaster with clipping for linear and non-linear heavy-tailed multi-armed bandits | May 11, 2023 | Multi-Armed Bandits | Code Available 1
Efficient Training of Multi-task Combinarotial Neural Solver with Multi-armed Bandits | May 10, 2023 | Combinatorial Optimization, Decoder | Unverified 0
Neural Exploitation and Exploration of Contextual Bandits | May 5, 2023 | Multi-Armed Bandits, Thompson Sampling | Code Available 1
Reward Teaching for Federated Multi-armed Bandits | May 3, 2023 | Multi-Armed Bandits | Unverified 0
Stochastic Contextual Bandits with Graph-based Contexts | May 2, 2023 | Multi-Armed Bandits | Unverified 0
First- and Second-Order Bounds for Adversarial Linear Contextual Bandits | May 1, 2023 | Multi-Armed Bandits | Unverified 0
Kullback-Leibler Maillard Sampling for Multi-armed Bandits with Bounded Rewards | Apr 28, 2023 | Multi-Armed Bandits, Thompson Sampling | Code Available 0
Quantum Natural Policy Gradients: Towards Sample-Efficient Reinforcement Learning | Apr 26, 2023 | Multi-Armed Bandits, reinforcement-learning | Code Available 0
Thompson Sampling Regret Bounds for Contextual Bandits with sub-Gaussian rewards | Apr 26, 2023 | Multi-Armed Bandits, Thompson Sampling | Unverified 0
Optimal Activation of Halting Multi-Armed Bandit Models | Apr 20, 2023 | Multi-Armed Bandits | Unverified 0
A Field Test of Bandit Algorithms for Recommendations: Understanding the Validity of Assumptions on Human Preferences in Multi-armed Bandits | Apr 16, 2023 | Multi-Armed Bandits, Recommendation Systems | Code Available 0
Learning Personalized Decision Support Policies | Apr 13, 2023 | Language Modelling, Large Language Model | Unverified 0
SmartChoices: Augmenting Software with Learned Implementations | Apr 12, 2023 | Multi-Armed Bandits, Philosophy | Unverified 0
BanditQ: Fair Bandits with Guaranteed Rewards | Apr 11, 2023 | Multi-Armed Bandits | Unverified 0
Full Gradient Deep Reinforcement Learning for Average-Reward Criterion | Apr 7, 2023 | Deep Reinforcement Learning, Multi-Armed Bandits | Unverified 0
Sharp Deviations Bounds for Dirichlet Weighted Sums with Application to analysis of Bayesian algorithms | Apr 6, 2023 | Multi-Armed Bandits, Thompson Sampling | Unverified 0
Federated Learning for Heterogeneous Bandits with Unobserved Contexts | Mar 29, 2023 | Federated Learning, Multi-Armed Bandits | Unverified 0
Adaptive Endpointing with Deep Contextual Multi-armed Bandits | Mar 23, 2023 | Multi-Armed Bandits | Unverified 0
An Empirical Evaluation of Federated Contextual Bandit Algorithms | Mar 17, 2023 | Federated Learning, Multi-Armed Bandits | Unverified 0