SOTAVerified

Multi-Armed Bandits

The multi-armed bandit problem is a task in which a fixed budget of resources must be allocated among competing choices (arms) so as to maximize expected gain, when each choice's properties are only partially known at the time of allocation. These problems typically involve an exploration/exploitation trade-off.
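The exploration/exploitation trade-off can be sketched with a minimal epsilon-greedy policy: with probability epsilon the learner explores a random arm, otherwise it exploits the arm with the highest current reward estimate. The Bernoulli arm means, the epsilon value, and the function name below are illustrative assumptions, not taken from any listed paper.

```python
import random

def epsilon_greedy_bandit(arm_means, n_rounds=10000, epsilon=0.1, seed=0):
    """Run an epsilon-greedy policy on Bernoulli arms (illustrative sketch)."""
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k          # pulls per arm
    estimates = [0.0] * k     # running mean reward per arm
    total_reward = 0.0
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(k)                            # explore: random arm
        else:
            arm = max(range(k), key=lambda a: estimates[a])   # exploit: best estimate
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0  # Bernoulli payoff
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]  # incremental mean
        total_reward += reward
    return estimates, counts, total_reward
```

With enough rounds the policy concentrates its pulls on the best arm while the forced exploration keeps every estimate consistent, which is the basic tension the papers listed below formalize and bound.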

(Image credit: Microsoft Research)

Papers

Showing 811–820 of 1262 papers

| Title | Status | Hype |
| --- | --- | --- |
| Modeling Human Decision-making in Generalized Gaussian Multi-armed Bandits | | 0 |
| Modelling Cournot Games as Multi-agent Multi-armed Bandits | | 0 |
| Model selection for behavioral learning data and applications to contextual bandits | | 0 |
| Model Selection for Generic Contextual Bandits | | 0 |
| Model Selection in Contextual Stochastic Bandit Problems | | 0 |
| Model Selection in Reinforcement Learning with General Function Approximations | | 0 |
| Modified Meta-Thompson Sampling for Linear Bandits and Its Bayes Regret Analysis | | 0 |
| More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning | | 0 |
| More Robust Doubly Robust Off-policy Evaluation | | 0 |
| Mortal Multi-Armed Bandits | | 0 |
Page 82 of 127

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified |
| 2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified |