SOTAVerified

Multi-Armed Bandits

Multi-armed bandits refer to a class of sequential decision-making problems in which a fixed amount of resources must be allocated among competing alternatives so as to maximize expected gain. These problems typically involve an exploration/exploitation trade-off: the learner must balance sampling arms to learn their payoffs against repeatedly pulling the arm currently believed to be best.

(Image credit: Microsoft Research)
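The exploration/exploitation trade-off described above can be illustrated with a minimal epsilon-greedy sketch (not any specific paper's method from the list below): with probability epsilon the agent explores a random arm, otherwise it exploits the arm with the highest estimated mean reward. The arm means and parameters here are arbitrary illustrative values.

```python
import random

def epsilon_greedy_bandit(true_means, epsilon=0.1, steps=1000, seed=0):
    """Illustrative epsilon-greedy agent for a stochastic multi-armed bandit.

    true_means: hypothetical mean reward of each arm (unknown to the agent).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                      # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = rng.gauss(true_means[arm], 1.0)             # noisy reward
        counts[arm] += 1
        # incremental update of the running mean for this arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward

estimates, counts, total = epsilon_greedy_bandit([0.1, 0.5, 0.9])
# With enough steps, the best arm (true mean 0.9) should attract most pulls.
```

This is the simplest baseline; most papers listed below study more refined strategies (UCB-style confidence bounds, Thompson sampling, contextual or restless variants) that manage the same trade-off with stronger regret guarantees.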

Papers

Showing 451–500 of 1262 papers

| Title | Status | Hype |
| --- | --- | --- |
| Networked Restless Bandits with Positive Externalities | Code | 0 |
| Stochastic Rising Bandits | Code | 0 |
| AC-Band: A Combinatorial Bandit-Based Approach to Algorithm Configuration | Code | 0 |
| On Regret-optimal Cooperative Nonstochastic Multi-armed Bandits | | 0 |
| Incorporating Multi-armed Bandit with Local Search for MaxSAT | Code | 0 |
| Constrained Pure Exploration Multi-Armed Bandits with a Fixed Budget | | 0 |
| Contextual Decision-Making with Knapsacks Beyond the Worst Case | | 0 |
| Contextual Bandits in a Survey Experiment on Charitable Giving: Within-Experiment Outcomes versus Policy Learning | | 0 |
| Transfer Learning for Contextual Multi-armed Bandits | | 0 |
| Causal Bandits: Online Decision-Making in Endogenous Settings | | 0 |
| Bandit Algorithms for Prophet Inequality and Pandora's Box | | 0 |
| Latent Bottlenecked Attentive Neural Processes | Code | 0 |
| On Penalization in Stochastic Multi-armed Bandits | | 0 |
| Multi-Player Bandits Robust to Adversarial Collisions | | 0 |
| Hypothesis Transfer in Bandits by Weighted Models | | 0 |
| Contextual Bandits with Packing and Covering Constraints: A Modular Lagrangian Approach via Regression | | 0 |
| Generalizing distribution of partial rewards for multi-armed bandits with temporally-partitioned rewards | | 0 |
| Thompson Sampling for High-Dimensional Sparse Linear Contextual Bandits | Code | 0 |
| Safe and Adaptive Decision-Making for Optimization of Safety-Critical Systems: The ARTEO Algorithm | Code | 0 |
| Contexts can be Cheap: Solving Stochastic Contextual Bandits with Linear Bandit Algorithms | | 0 |
| Adaptive Data Depth via Multi-Armed Bandits | Code | 0 |
| Indexability is Not Enough for Whittle: Improved, Near-Optimal Algorithms for Restless Bandits | Code | 1 |
| Revisiting Simple Regret: Fast Rates for Returning a Good Arm | | 0 |
| Robust Contextual Linear Bandits | | 0 |
| Conditionally Risk-Averse Contextual Bandits | Code | 0 |
| Local Metric Learning for Off-Policy Evaluation in Contextual Bandits with Continuous Actions | Code | 0 |
| PAC-Bayesian Offline Contextual Bandits With Guarantees | | 0 |
| Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees | | 0 |
| Fast Beam Alignment via Pure Exploration in Multi-armed Bandits | Code | 0 |
| Optimal Contextual Bandits with Knapsacks under Realizability via Regression Oracles | Code | 0 |
| Vertical Federated Linear Contextual Bandits | | 0 |
| Anytime-valid off-policy inference for contextual bandits | Code | 1 |
| Contextual bandits with concave rewards, and an application to fair ranking | | 0 |
| Multi-agent Dynamic Algorithm Configuration | Code | 1 |
| Simulated Contextual Bandits for Personalization Tasks from Recommendation Datasets | Code | 0 |
| Maximum entropy exploration in contextual bandits with neural networks and energy based models | | 0 |
| Constant regret for sequence prediction with limited advice | | 0 |
| ProtoBandit: Efficient Prototype Selection via Multi-Armed Bandits | | 0 |
| Replicable Bandits | | 0 |
| Improved High-Probability Regret for Adversarial Bandits with Time-Varying Feedback Graphs | | 0 |
| On Best-Arm Identification with a Fixed Budget in Non-Parametric Multi-Armed Bandits | | 0 |
| Off-Policy Risk Assessment in Markov Decision Processes | | 0 |
| Active Inference for Autonomous Decision-Making with Contextual Multi-Armed Bandits | | 0 |
| Towards Robust Off-Policy Evaluation via Human Inputs | | 0 |
| Constrained Policy Optimization for Controlled Self-Learning in Conversational AI Systems | | 0 |
| Risk-aware linear bandits with convex loss | | 0 |
| Double Doubly Robust Thompson Sampling for Generalized Linear Contextual Bandits | | 0 |
| Risk-Averse Multi-Armed Bandits with Unobserved Confounders: A Case Study in Emotion Regulation in Mobile Health | | 0 |
| When Privacy Meets Partial Information: A Refined Analysis of Differentially Private Bandits | | 0 |
| Multi-Armed Bandits with Self-Information Rewards | | 0 |
Page 10 of 26

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified |
| 2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified |