SOTAVerified

Multi-Armed Bandits

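The multi-armed bandit problem is a task in which a fixed, limited set of resources must be allocated among competing choices (arms) so as to maximize expected gain. These problems typically involve an exploration/exploitation trade-off: the learner must decide whether to try arms whose payoff is still uncertain or to keep choosing the arm that has performed best so far.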
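
The exploration/exploitation trade-off can be made concrete with a small simulation. The sketch below is illustrative only and is not drawn from any paper listed on this page: it assumes Bernoulli rewards, uses the simple epsilon-greedy rule, and the function name `run_epsilon_greedy`, its parameters, and the arm means are all hypothetical. It also tracks cumulative (expected) regret, taken here to be the summed gap between the best arm's mean and the mean of the arm actually pulled.

```python
import random


def run_epsilon_greedy(true_means, n_rounds=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on a Bernoulli bandit (illustrative sketch).

    Returns (estimates, counts, cumulative_regret).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms          # number of pulls per arm
    estimates = [0.0] * n_arms     # running mean reward per arm
    best_mean = max(true_means)
    cumulative_regret = 0.0

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated mean.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Bernoulli reward drawn from the (hidden) true mean of the arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running mean for the pulled arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

        # Expected regret of this pull relative to the best arm.
        cumulative_regret += best_mean - true_means[arm]

    return estimates, counts, cumulative_regret


if __name__ == "__main__":
    est, pulls, regret = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("estimated means:", [round(e, 3) for e in est])
    print("pulls per arm:  ", pulls)
    print("cumulative regret:", round(regret, 2))
```

A larger `epsilon` spends more pulls on exploration and keeps adding regret for as long as it runs, while a very small `epsilon` risks locking onto a suboptimal arm early. Methods such as Thompson sampling, which appears in the paper titles below, balance the two adaptively rather than with a fixed rate.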

Papers

Showing 941-950 of 1262 papers

Title | Status | Hype
Fair Contextual Multi-Armed Bandits: Theory and Experiments | | 0
Sublinear Optimal Policy Value Estimation in Contextual Bandits | | 0
Nonparametric Contextual Bandits in Metric Spaces with Unknown Metric | | 0
Epsilon-Best-Arm Identification in Pay-Per-Reward Multi-Armed Bandits | | 0
Thompson Sampling for Multinomial Logit Contextual Bandits | Code | 0
Offline Contextual Bandits with High Probability Fairness Guarantees | Code | 0
Learning in Generalized Linear Contextual Bandits with Stochastic Delays | | 0
Surrogate Objectives for Batch Policy Optimization in One-step Decision Making | | 0
Contextual Combinatorial Conservative Bandits | | 0
Automatic Ensemble Learning for Online Influence Maximization | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified
2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified