
Multi-Armed Bandits

Multi-armed bandits refer to a task where a fixed amount of resources must be allocated between competing choices (arms) in a way that maximizes expected gain. Typically these problems involve an exploration/exploitation trade-off.

(Image credit: Microsoft Research)
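To illustrate the exploration/exploitation trade-off, here is a minimal epsilon-greedy sketch over Bernoulli arms. The arm probabilities, horizon, and epsilon below are illustrative assumptions, not values taken from any paper or benchmark listed on this page.

```python
# Minimal epsilon-greedy bandit sketch (illustrative values, not from any listed paper).
import random

def run_epsilon_greedy(arm_probs, horizon=1000, epsilon=0.1, seed=0):
    """Play `horizon` rounds over Bernoulli arms, trading off exploration and exploitation."""
    rng = random.Random(seed)
    counts = [0] * len(arm_probs)    # number of pulls per arm
    values = [0.0] * len(arm_probs)  # running mean reward per arm
    total_reward = 0.0
    for _ in range(horizon):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(len(arm_probs))
        else:
            # Exploit: pick the arm with the highest estimated mean reward.
            arm = max(range(len(arm_probs)), key=lambda a: values[a])
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean update
        total_reward += reward
    return total_reward, counts

if __name__ == "__main__":
    # Hypothetical arms with true success probabilities 0.2, 0.5, 0.7.
    print(run_epsilon_greedy([0.2, 0.5, 0.7]))
```

With a small epsilon, most pulls end up on the empirically best arm while occasional random pulls keep the estimates of the other arms from going stale.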

Papers

Showing 891–900 of 1262 papers

Title | Status | Hype
Sequential Design for Ranking Response Surfaces | | 0
Sequential Monte Carlo Bandits | | 0
Settling the Communication Complexity for Distributed Offline Reinforcement Learning | | 0
SHAP@k: Efficient and Probably Approximately Correct (PAC) Identification of Top-k Features | | 0
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF | | 0
Sharp Deviations Bounds for Dirichlet Weighted Sums with Application to analysis of Bayesian algorithms | | 0
Shuffle Private Linear Contextual Bandits | | 0
Simple Regret Minimization for Contextual Bandits | | 0
Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition | | 0
Skyline Identification in Multi-Armed Bandits | | 0
Page 90 of 127

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified
2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified
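For context on the metric above: cumulative (pseudo-)regret is typically the total gap between the mean reward of the best arm and the mean reward of the arms actually pulled. A minimal sketch, assuming the true arm means are known; all values here are hypothetical and unrelated to the benchmark figures in the table.

```python
# Minimal sketch of cumulative (pseudo-)regret for a sequence of arm pulls,
# assuming the true mean reward of each arm is known (hypothetical values).

def cumulative_regret(arm_means, pulls):
    """Sum over rounds of (best arm's mean reward - pulled arm's mean reward)."""
    best = max(arm_means)
    return sum(best - arm_means[a] for a in pulls)

if __name__ == "__main__":
    means = [0.2, 0.5, 0.7]           # hypothetical true mean rewards
    pulls = [0, 1, 2, 2, 2, 1, 2, 2]  # hypothetical sequence of chosen arms
    # 0.5 + 0.2 + 0 + 0 + 0 + 0.2 + 0 + 0, i.e. about 0.9 up to floating-point rounding
    print(cumulative_regret(means, pulls))
```

A lower cumulative regret means the algorithm spent fewer rounds on suboptimal arms, which is why it is the standard comparison metric for bandit benchmarks such as the one above.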