
Multi-Armed Bandits

Multi-armed bandits refer to a class of tasks in which a fixed, limited amount of resources must be allocated among competing alternatives so as to maximize expected gain. These problems typically involve an exploration/exploitation trade-off: the learner must balance trying under-sampled alternatives against repeatedly choosing the one that currently looks best.

(Image credit: Microsoft Research)
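
As a concrete illustration of the exploration/exploitation trade-off, here is a minimal epsilon-greedy sketch for a Bernoulli bandit. It is illustrative only and not taken from any paper listed below; the function name, the reward model, and all parameter values are assumptions.

import random

def epsilon_greedy_bandit(true_means, n_rounds=10_000, epsilon=0.1, seed=0):
    """Minimal epsilon-greedy sketch for a Bernoulli multi-armed bandit.

    true_means: per-arm success probabilities (unknown to the learner;
    assumed here only to simulate rewards). With probability epsilon we
    explore a random arm; otherwise we exploit the arm with the highest
    empirical mean reward so far.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms      # number of pulls per arm
    values = [0.0] * n_arms    # empirical mean reward per arm
    total_reward = 0.0

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                        # explore
        else:
            arm = max(range(n_arms), key=values.__getitem__)   # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental mean update: v_n = v_{n-1} + (r - v_{n-1}) / n
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    return total_reward, counts

if __name__ == "__main__":
    reward, pulls = epsilon_greedy_bandit([0.3, 0.5, 0.7])
    print(f"total reward: {reward:.0f}, pulls per arm: {pulls}")

With a small epsilon, most pulls concentrate on the best arm (mean 0.7) once its empirical estimate stabilizes, while the remaining exploration budget keeps the other estimates from going stale.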

Papers

Showing 891–900 of 1262 papers

Title | Status | Hype
Greedy Algorithm almost Dominates in Smoothed Contextual Bandits | — | 0
Neural Network Retraining for Model Serving | — | 0
Learning to Rank in the Position Based Model with Bandit Feedback | — | 0
Thompson Sampling for Linearly Constrained Bandits | Code | 0
Sequential Batch Learning in Finite-Action Linear Contextual Bandits | — | 0
Power Constrained Bandits | Code | 0
Exploration with Limited Memory: Streaming Algorithms for Coin Tossing, Noisy Comparisons, and Multi-Armed Bandits | — | 0
Hawkes Process Multi-armed Bandits for Disaster Search and Rescue | — | 0
Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits under Realizability | — | 0
Optimal No-regret Learning in Repeated First-price Auctions | — | 0
Page 90 of 127

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | — | Unverified
2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | — | Unverified