
Multi-Armed Bandits

Multi-armed bandits refer to a class of problems in which a fixed, limited amount of resources must be allocated among competing choices in a way that maximizes expected gain, while each choice's payoff is only partially known at allocation time. These problems typically involve an exploration/exploitation trade-off.

(Image credit: Microsoft Research)
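To make the exploration/exploitation trade-off concrete, here is a minimal epsilon-greedy sketch in Python. It is illustrative only and is not tied to any paper or benchmark on this page; the arm reward probabilities, epsilon, and round count are made-up assumptions.

```python
import random

# Minimal epsilon-greedy bandit sketch (illustrative; the reward
# probabilities and epsilon below are made-up example values).
ARM_PROBS = [0.3, 0.5, 0.7]      # hypothetical Bernoulli reward probability per arm
EPSILON = 0.1                    # probability of exploring a random arm
N_ROUNDS = 10_000

counts = [0] * len(ARM_PROBS)    # pulls per arm
values = [0.0] * len(ARM_PROBS)  # running mean reward per arm
total_reward = 0.0

for _ in range(N_ROUNDS):
    if random.random() < EPSILON:
        arm = random.randrange(len(ARM_PROBS))                      # explore
    else:
        arm = max(range(len(ARM_PROBS)), key=lambda a: values[a])   # exploit
    reward = 1.0 if random.random() < ARM_PROBS[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]             # incremental mean
    total_reward += reward

print(f"estimated arm values: {values}")
print(f"total reward over {N_ROUNDS} rounds: {total_reward:.0f}")
```

With a small epsilon, most rounds exploit the arm with the highest estimated value while a fraction of rounds keep exploring the others, so the estimates of all arms continue to improve.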

Papers

Showing 891–900 of 1262 papers

Title | Status | Hype
NeuralUCB: Contextual Bandits with Neural Network-Based Exploration | | 0
No DBA? No regret! Multi-armed bandits for index tuning of analytical and HTAP workloads with provable guarantees | | 0
Nonlinear Sequential Accepts and Rejects for Identification of Top Arms in Stochastic Bandits | | 0
Nonparametric Contextual Bandits in an Unknown Metric Space | | 0
Nonparametric Contextual Bandits in Metric Spaces with Unknown Metric | | 0
Nonparametric Stochastic Contextual Bandits | | 0
Non-Stationary Contextual Bandit Learning via Neural Predictive Ensemble Sampling | | 0
Adversarial Rewards in Universal Learning for Contextual Bandits | | 0
Non-Stationary Learning of Neural Networks with Automatic Soft Parameter Reset | | 0
Non-stationary Reinforcement Learning without Prior Knowledge: An Optimal Black-box Approach | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified
2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified
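The benchmark metric above is cumulative regret: the gap between the reward of always playing the best arm and the reward actually collected over the run. A minimal sketch of the computation, assuming Bernoulli arms with known true means (the means and arm sequence below are made-up example values, not taken from the table above):

```python
# Cumulative regret sketch: regret_T = sum over rounds t of (mu_star - mu[a_t]),
# where mu_star is the best arm's true mean and a_t is the arm chosen at round t.
# The true means and chosen arms are hypothetical examples, unrelated to the benchmark.
true_means = [0.3, 0.5, 0.7]          # hypothetical true expected reward per arm
best_mean = max(true_means)

chosen_arms = [0, 2, 1, 2, 2, 0, 2]   # example sequence of pulled arms

cumulative_regret = sum(best_mean - true_means[a] for a in chosen_arms)
print(f"cumulative regret after {len(chosen_arms)} rounds: {cumulative_regret:.2f}")
```

Lower cumulative regret is better: an algorithm that quickly identifies and keeps pulling the best arm accumulates regret slowly.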