SOTAVerified

Multi-Armed Bandits

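The multi-armed bandit problem is a task in which a fixed amount of resources must be allocated among competing choices so as to maximize expected gain. These problems typically involve an exploration/exploitation trade-off: trying lesser-known choices to learn their payoffs versus favoring the choice that currently looks best.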
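To make the exploration/exploitation trade-off concrete, here is a minimal sketch of epsilon-greedy, one common baseline strategy. It is not drawn from any of the papers listed on this page. The function name `epsilon_greedy`, the Bernoulli reward model, and the parameters `true_means`, `steps`, `epsilon`, and `seed` are all assumptions made for illustration.

```python
import random


def epsilon_greedy(true_means, steps=1000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on a Bernoulli bandit (illustrative sketch).

    true_means: hypothetical success probability of each arm, unknown to the learner.
    Returns (estimates, pulls, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    pulls = [0] * n_arms          # how many times each arm was played
    estimates = [0.0] * n_arms    # running mean reward per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated mean
            # (ties resolve to the lowest index).
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Draw a 0/1 reward from the chosen arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running mean for that arm.
        pulls[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
        total_reward += reward

    return estimates, pulls, total_reward


if __name__ == "__main__":
    est, pulls, total = epsilon_greedy([0.2, 0.5, 0.8])
    print("estimated means:", est)
    print("pull counts:", pulls)
    print("total reward:", total)
```

A larger `epsilon` spends more of the budget on exploration, while a smaller one commits sooner to the arm that currently looks best. In a simulation like this, where the true means are known, one way to measure the cost of learning is to compare `total_reward` with `steps * max(true_means)`.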


Papers

Showing 901-925 of 1262 papers

Title | Status | Hype
Self-Supervised Contextual Bandits in Computer Vision | | 0
Learning and Fairness in Energy Harvesting: A Maximin Multi-Armed Bandits Approach | | 0
Delay-Adaptive Learning in Generalized Linear Contextual Bandits | | 0
Convex Hull Monte-Carlo Tree Search | | 0
Online Residential Demand Response via Contextual Multi-Armed Bandits | | 0
A Farewell to Arms: Sequential Reward Maximization on a Budget with a Giving Up Option | | 0
Stochastic Linear Contextual Bandits with Diverse Contexts | | 0
Robustness Guarantees for Mode Estimation with an Application to Bandits | | 0
Generalized Policy Elimination: an efficient algorithm for Nonparametric Contextual Bandits | | 0
Taking a hint: How to leverage loss predictors in contextual bandits? | | 0
Model Selection in Contextual Stochastic Bandit Problems | | 0
Bounded Regret for Finitely Parameterized Multi-Armed Bandits | | 0
Distributed Cooperative Decision Making in Multi-agent Multi-armed Bandits | | 0
Decentralized Multi-player Multi-armed Bandits with No Collision Information | | 0
Designing Truthful Contextual Multi-Armed Bandits based Sponsored Search Auctions | | 0
Structured Linear Contextual Bandits: A Sharp and Geometric Smoothed Analysis | | 0
Bandit Learning with Delayed Impact of Actions | | 0
The Unreasonable Effectiveness of Greedy Algorithms in Multi-Armed Bandit with Many Arms | Code | 0
Survey Bandits with Regret Guarantees | | 0
Online Learning in Contextual Bandits using Gated Linear Networks | | 0
Residual Bootstrap Exploration for Bandit Algorithms | | 0
On conditional versus marginal bias in multi-armed bandits | | 0
Adaptive Estimator Selection for Off-Policy Evaluation | Code | 0
Coordination without communication: optimal regret in two players multi-armed bandits | | 0
Tight Lower Bounds for Combinatorial Multi-Armed Bandits | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified
2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified