SOTAVerified

Multi-Armed Bandits

Multi-armed bandits refer to a class of sequential decision problems in which a fixed amount of resources must be allocated among competing choices (arms) so as to maximize expected gain, when each arm's payoff is only partially known at the time of allocation. These problems typically involve an exploration/exploitation trade-off: sampling less-known arms to learn their payoffs versus pulling the arm that currently looks best.

(Image credit: Microsoft Research)
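The exploration/exploitation trade-off described above can be sketched with an epsilon-greedy policy, one of the simplest bandit algorithms. This is an illustrative sketch, not an implementation from any of the papers listed below; the arm means and parameters are made up for the example.

```python
import random

def epsilon_greedy(true_means, epsilon=0.1, steps=1000, seed=0):
    """With probability `epsilon`, explore a uniformly random arm;
    otherwise, exploit the arm with the highest empirical mean.
    `true_means` are hypothetical Gaussian reward means per arm."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms          # pulls per arm
    values = [0.0] * n_arms        # running empirical mean per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                              # explore
        else:
            arm = max(range(n_arms), key=values.__getitem__)         # exploit
        reward = rng.gauss(true_means[arm], 1.0)                     # noisy payoff
        counts[arm] += 1
        # incremental update of the empirical mean for this arm
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward
    return counts, total_reward

counts, total = epsilon_greedy([0.1, 0.5, 0.9])
# the highest-mean arm typically accumulates the most pulls over time
print(counts, total)
```

Many of the papers listed below study more refined strategies (Thompson sampling, UCB variants, contextual bandits) that replace the fixed epsilon with adaptive exploration.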

Papers

Showing 871–880 of 1262 papers

| Title | Status | Hype |
| --- | --- | --- |
| Quantile Multi-Armed Bandits: Optimal Best-Arm Identification and a Differentially Private Scheme | | 0 |
| BanditPAM: Almost Linear Time k-Medoids Clustering via Multi-Armed Bandits | Code | 1 |
| Bandits with Partially Observable Confounded Data | | 0 |
| TS-UCB: Improving on Thompson Sampling With Little to No Additional Computation | | 0 |
| Efficient Contextual Bandits with Continuous Actions | Code | 1 |
| Gaussian Gated Linear Networks | Code | 0 |
| Distributionally Robust Batch Contextual Bandits | | 0 |
| Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition | | 0 |
| Online Learning in Iterated Prisoner's Dilemma to Mimic Human Behavior | Code | 0 |
| Meta-Learning Bandit Policies by Gradient Ascent | | 0 |
Page 88 of 127

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified |
| 2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified |