SOTAVerified

Multi-Armed Bandits

Multi-armed bandits are a class of problems in which a fixed budget of resources must be allocated among competing choices so as to maximize expected gain. These problems typically involve an exploration/exploitation trade-off: pulling the arm that currently looks best versus sampling other arms to learn more about them.

(Image credit: Microsoft Research)
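The exploration/exploitation trade-off described above can be illustrated with a minimal epsilon-greedy sketch. This is a generic textbook baseline, not an algorithm from any paper listed below; the arm probabilities and parameter values are illustrative assumptions.

```python
import random

def epsilon_greedy_bandit(arms, pulls=1000, epsilon=0.1, seed=0):
    """Epsilon-greedy on Bernoulli arms.

    `arms` holds the (unknown to the agent) success probability of each arm.
    With probability `epsilon` we explore a random arm; otherwise we exploit
    the arm with the highest running mean reward.
    """
    rng = random.Random(seed)
    counts = [0] * len(arms)      # pulls per arm
    values = [0.0] * len(arms)    # running mean reward per arm
    total = 0.0
    for _ in range(pulls):
        if rng.random() < epsilon:
            a = rng.randrange(len(arms))                        # explore
        else:
            a = max(range(len(arms)), key=lambda i: values[i])  # exploit
        reward = 1.0 if rng.random() < arms[a] else 0.0
        counts[a] += 1
        # incremental mean update
        values[a] += (reward - values[a]) / counts[a]
        total += reward
    return values, total

values, total = epsilon_greedy_bandit([0.2, 0.5, 0.8])
```

Algorithms on this page such as LinUCB or Thompson-sampling variants replace the epsilon-greedy choice rule with confidence-bound or posterior-sampling rules, but the interaction loop is the same.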

Papers

Showing 1211–1220 of 1262 papers

| Title | Status | Hype |
| --- | --- | --- |
| Truncated LinUCB for Stochastic Linear Bandits | Code | 0 |
| Adaptive Estimator Selection for Off-Policy Evaluation | Code | 0 |
| Practical Bayesian Learning of Neural Networks via Adaptive Optimisation Methods | Code | 0 |
| NeuroSep-CP-LCB: A Deep Learning-based Contextual Multi-armed Bandit Algorithm with Uncertainty Quantification for Early Sepsis Prediction | Code | 0 |
| Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization | Code | 0 |
| A Survey on Contextual Multi-armed Bandits | Code | 0 |
| Practical Calculation of Gittins Indices for Multi-armed Bandits | Code | 0 |
| Stay With Me: Lifetime Maximization Through Heteroscedastic Linear Bandits With Reneging | Code | 0 |
| A Field Test of Bandit Algorithms for Recommendations: Understanding the Validity of Assumptions on Human Preferences in Multi-armed Bandits | Code | 0 |
| Hierarchical Multi-Armed Bandits for the Concurrent Intelligent Tutoring of Concepts and Problems of Varying Difficulty Levels | Code | 0 |
Page 122 of 127

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified |
| 2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified |
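The benchmark metric above, cumulative regret, measures how much reward a policy forgoes relative to always pulling the best arm. A minimal sketch of the standard pseudo-regret definition follows; the exact evaluation protocol behind the leaderboard numbers is not specified on this page, and the example values are illustrative.

```python
def cumulative_regret(arm_means, chosen_arms):
    """Cumulative pseudo-regret: for each round, add the gap between the
    best arm's mean reward and the mean reward of the arm actually chosen.

    `arm_means` lists the true mean reward of each arm; `chosen_arms` is
    the sequence of arm indices a policy pulled.
    """
    best = max(arm_means)
    return sum(best - arm_means[a] for a in chosen_arms)

# Arms with means [0.2, 0.5, 0.8]: pulling arm 0 twice and arm 2 once
# accumulates a regret of (0.8 - 0.2) * 2 + 0 = 1.2.
regret = cumulative_regret([0.2, 0.5, 0.8], [0, 0, 2])
```

Lower is better: a policy that identifies and sticks with the best arm quickly keeps this sum small.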