SOTAVerified

Multi-Armed Bandits

Multi-armed bandits are a class of sequential decision-making problems in which a fixed amount of resources must be allocated among competing alternatives (arms) so as to maximize expected gain. These problems typically involve an exploration/exploitation trade-off: the learner must balance trying arms whose payoffs are still uncertain against repeatedly playing the arm that currently looks best.
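The exploration/exploitation trade-off can be illustrated with a minimal epsilon-greedy agent on a Bernoulli bandit. This is a generic sketch, not taken from any paper listed below; the arm probabilities and parameter values are illustrative.

```python
import random

def epsilon_greedy_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Run an epsilon-greedy agent on a Bernoulli bandit.

    true_means: per-arm reward probabilities (unknown to the agent;
    illustrative values, not from any benchmark).
    Returns (reward estimates, pull counts per arm, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms       # pulls per arm
    estimates = [0.0] * n_arms  # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)            # explore: random arm
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean for the pulled arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward
```

With enough steps, the exploit branch concentrates pulls on the arm with the highest true mean, while the epsilon fraction of random pulls keeps the estimates of the other arms from going stale.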

(Image credit: Microsoft Research)

Papers

Showing 981–990 of 1262 papers

Title | Status | Hype
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits | | 0
Optimal Best-Arm Identification under Fixed Confidence with Multiple Optima | | 0
Optimal cross-learning for contextual bandits with unknown context distributions | | 0
Optimal Multitask Linear Regression and Contextual Bandits under Sparse Heterogeneity | | 0
Optimal Learning for Sequential Decision Making for Expensive Cost Functions with Stochastic Binary Feedbacks | | 0
Towards Costless Model Selection in Contextual Bandits: A Bias-Variance Perspective | | 0
Optimal Multi-Objective Best Arm Identification with Fixed Confidence | | 0
Optimal No-regret Learning in Repeated First-price Auctions | | 0
Optimal Rates of (Locally) Differentially Private Heavy-tailed Multi-Armed Bandits | | 0
Optimal Streaming Algorithms for Multi-Armed Bandits | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified
2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified
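Cumulative regret, the metric reported above, is the gap between the expected reward of always playing the best arm and the expected reward of the arms actually played. A minimal sketch, using illustrative arm means and pull sequences unrelated to the benchmark entries:

```python
def cumulative_regret(true_means, pulls):
    """Expected cumulative regret of a sequence of arm pulls.

    true_means: per-arm expected rewards (illustrative values).
    pulls: sequence of arm indices chosen by the agent.
    """
    best = max(true_means)
    # Each suboptimal pull contributes the gap to the best arm's mean.
    return sum(best - true_means[arm] for arm in pulls)
```

For example, on arms with means 0.2 and 0.8, pulling arm 0 once and arm 1 twice incurs regret 0.6; an agent that learns the best arm keeps this quantity growing sublinearly in the number of pulls.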