SOTAVerified

Multi-Armed Bandits

A multi-armed bandit is a problem in which a fixed set of resources must be allocated among competing choices (arms) so as to maximize expected gain, when each choice's reward properties are only partially known at decision time. These problems typically involve an exploration/exploitation trade-off.
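The exploration/exploitation trade-off can be illustrated with a minimal epsilon-greedy agent on a Bernoulli bandit. This is a generic sketch, not the method of any paper listed below; the arm means, step count, and `epsilon` value are illustrative assumptions.

```python
import random

def epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit; return per-arm pull counts."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms    # pulls per arm
    values = [0.0] * n_arms  # running mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: random arm
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the empirical mean for the pulled arm.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts

counts = epsilon_greedy([0.2, 0.5, 0.8])
# With enough steps, the best arm (index 2) receives most of the pulls,
# while the epsilon fraction of steps keeps exploring the other arms.
```

With `epsilon=0.1`, roughly 10% of steps are spent exploring uniformly; the rest exploit the current best estimate, which is the basic tension the papers below refine in various directions.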

(Image credit: Microsoft Research)

Papers

Showing 451–500 of 1262 papers

Titles (papers marked [Code] have code available; every paper on this page currently has a Hype score of 0):

Bandit Social Learning: Exploration under Myopic Behavior
Adversarial Rewards in Universal Learning for Contextual Bandits
Piecewise-Stationary Multi-Objective Multi-Armed Bandit with Application to Joint Communications and Sensing [Code]
Leveraging User-Triggered Supervision in Contextual Bandits
On Private and Robust Bandits
Multiplier Bootstrap-based Exploration
Randomized Greedy Learning for Non-monotone Stochastic Submodular Maximization Under Full-bandit Feedback
Stochastic Contextual Bandits with Long Horizon Rewards
Quantum contextual bandits and recommender systems for quantum data
Improved Algorithms for Multi-period Multi-class Packing Problems with Bandit Feedback
Adversarial Attacks on Adversarial Bandits
A Framework for Adapting Offline Algorithms to Solve Combinatorial Multi-Armed Bandit Problems with Bandit Feedback
Contextual Causal Bayesian Optimisation
Communication-Efficient Collaborative Regret Minimization in Multi-Armed Bandits
Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning
Quantum Heavy-tailed Bandits
Multi-Armed Bandits and Quantum Channel Oracles
Multi-armed Bandit Learning for TDMA Transmission Slot Scheduling and Defragmentation for Improved Bandwidth Usage
Best Arm Identification in Stochastic Bandits: Beyond β-optimality
Local Differential Privacy for Sequential Decision Making in a Changing Environment
Contextual Bandits and Optimistically Universal Learning
Online Statistical Inference for Contextual Bandits via Stochastic Gradient Descent
On the Complexity of Representation Learning in Contextual Linear Bandits
Faster Maximum Inner Product Search in High Dimensions
MABSplit: Faster Forest Training Using Multi-Armed Bandits [Code]
Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes
Networked Restless Bandits with Positive Externalities [Code]
Stochastic Rising Bandits [Code]
AC-Band: A Combinatorial Bandit-Based Approach to Algorithm Configuration [Code]
On Regret-optimal Cooperative Nonstochastic Multi-armed Bandits
Incorporating Multi-armed Bandit with Local Search for MaxSAT [Code]
Constrained Pure Exploration Multi-Armed Bandits with a Fixed Budget
Contextual Decision-Making with Knapsacks Beyond the Worst Case
Transfer Learning for Contextual Multi-armed Bandits
Contextual Bandits in a Survey Experiment on Charitable Giving: Within-Experiment Outcomes versus Policy Learning
Causal Bandits: Online Decision-Making in Endogenous Settings
Bandit Algorithms for Prophet Inequality and Pandora's Box
On Penalization in Stochastic Multi-armed Bandits
Latent Bottlenecked Attentive Neural Processes [Code]
Multi-Player Bandits Robust to Adversarial Collisions
Hypothesis Transfer in Bandits by Weighted Models
Contextual Bandits with Packing and Covering Constraints: A Modular Lagrangian Approach via Regression
Generalizing distribution of partial rewards for multi-armed bandits with temporally-partitioned rewards
Thompson Sampling for High-Dimensional Sparse Linear Contextual Bandits [Code]
Safe and Adaptive Decision-Making for Optimization of Safety-Critical Systems: The ARTEO Algorithm [Code]
Adaptive Data Depth via Multi-Armed Bandits [Code]
Contexts can be Cheap: Solving Stochastic Contextual Bandits with Linear Bandit Algorithms
Revisiting Simple Regret: Fast Rates for Returning a Good Arm
Robust Contextual Linear Bandits
PAC-Bayesian Offline Contextual Bandits With Guarantees
Page 10 of 26

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | — | Unverified
2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | — | Unverified