SOTAVerified

Multi-Armed Bandits

Multi-armed bandits refer to problems in which a fixed amount of resources must be allocated among competing alternatives so as to maximize expected gain. These problems typically involve an exploration/exploitation trade-off.

(Image credit: Microsoft Research)
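As an illustration of the exploration/exploitation trade-off, here is a minimal sketch of the classic epsilon-greedy strategy on a Bernoulli bandit. The function name, arm means, and parameter values are illustrative choices, not taken from any paper listed below: with probability epsilon the agent explores a random arm, otherwise it exploits the arm with the highest empirical mean reward.

```python
import random

def epsilon_greedy_bandit(true_means, n_rounds=10000, epsilon=0.1, seed=0):
    """Run an epsilon-greedy agent on a Bernoulli bandit (illustrative sketch)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms       # number of pulls per arm
    estimates = [0.0] * n_arms  # empirical mean reward per arm
    total_reward = 0.0
    for _ in range(n_rounds):
        # Explore with probability epsilon, otherwise exploit the current best estimate.
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])
        # Bernoulli reward: 1 with probability true_means[arm], else 0.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean for the pulled arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return total_reward, estimates

total_reward, estimates = epsilon_greedy_bandit([0.2, 0.5, 0.8])
```

Smaller epsilon means more exploitation and faster convergence when the early estimates happen to be right; larger epsilon pays a higher exploration cost but is less likely to lock onto a suboptimal arm.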

Papers

Showing 901–910 of 1262 papers

| Title | Status | Hype |
| --- | --- | --- |
| Slowly Changing Adversarial Bandit Algorithms are Efficient for Discounted MDPs | | 0 |
| Small-loss bounds for online learning with partial information | | 0 |
| Small Total-Cost Constraints in Contextual Bandits with Knapsacks, with Application to Fairness | | 0 |
| SmartChoices: Augmenting Software with Learned Implementations | | 0 |
| Smoothed Online Learning is as Easy as Statistical Learning | | 0 |
| Smooth Sequential Optimisation with Delayed Feedback | | 0 |
| Social Learning in Multi Agent Multi Armed Bandits | | 0 |
| Sparse Additive Contextual Bandits: A Nonparametric Approach for Online Decision-making with High-dimensional Covariates | | 0 |
| Sparse Nonparametric Contextual Bandits | | 0 |
| Sparsity, variance and curvature in multi-armed bandits | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified |
| 2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified |