SOTAVerified

Multi-Armed Bandits

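The multi-armed bandit problem is a task in which a fixed amount of resources must be allocated among competing choices (arms) in a way that maximizes expected gain. These problems typically involve an exploration/exploitation trade-off: the learner must balance trying arms whose payoff is still uncertain against choosing the arm that currently looks best.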
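
As a rough illustration of the exploration/exploitation trade-off, here is a minimal sketch of an epsilon-greedy policy on a simulated Bernoulli bandit. It is not taken from any of the papers listed on this page. The function name `epsilon_greedy`, the arm means, and the parameter values are all assumptions chosen for the example.

```python
import random


def epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a simulated Bernoulli bandit.

    true_means are hypothetical success probabilities, hidden from the policy.
    Returns (pull counts, value estimates, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Simulated Bernoulli reward for the chosen arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running mean for that arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, estimates, total_reward


if __name__ == "__main__":
    means = [0.1, 0.5, 0.9]  # assumed arm means, for illustration only
    counts, estimates, total = epsilon_greedy(means)
    # Regret relative to always playing the best arm in expectation.
    regret = max(means) * sum(counts) - total
    print("pulls per arm:", counts)
    print("estimated means:", [round(e, 3) for e in estimates])
    print("empirical regret:", round(regret, 1))
```

With a small `epsilon` the policy spends most pulls on the arm it currently believes is best, while the occasional random pull keeps refining the estimates of the other arms. Setting `epsilon` to 0 removes exploration entirely, so the policy can lock onto a poor arm after a few unlucky early draws.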

(Image credit: Microsoft Research)

Papers

Showing 1051–1075 of 1262 papers

| Title | Status | Hype |
|---|---|---|
| Balanced Linear Contextual Bandits | | 0 |
| ADARES: Adaptive Resource Management for Virtual Machines | | 0 |
| Contextual Combinatorial Multi-armed Bandits with Volatile Arms and Submodular Reward | | 0 |
| Why so gloomy? A Bayesian explanation of human pessimism bias in the multi-armed bandit task | | 0 |
| A Bandit Approach to Sequential Experimental Design with False Discovery Control | | 0 |
| Stochastic Top-K Subset Bandits with Linear Space and Non-Linear Feedback | | 0 |
| Adversarial Bandits with Knapsacks | | 0 |
| Kernel-based Multi-Task Contextual Bandits in Cellular Network Configuration | | 0 |
| Rotting bandits are not harder than stochastic ones | | 0 |
| Bandits with Temporal Stochastic Constraints | | 0 |
| Decentralized Exploration in Multi-Armed Bandits -- Extended version | | 0 |
| Best Arm Identification in Linked Bandits | | 0 |
| Sample complexity of partition identification using multi-armed bandits | | 0 |
| Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits | | 0 |
| Adapting multi-armed bandits policies to contextual bandits scenarios | Code | 0 |
| Practical Bayesian Learning of Neural Networks via Adaptive Optimisation Methods | Code | 0 |
| Multi-armed Bandits with Compensation | | 0 |
| Stay With Me: Lifetime Maximization Through Heteroscedastic Linear Bandits With Reneging | Code | 0 |
| Online learning with feedback graphs and switching costs | | 0 |
| Simple Regret Minimization for Contextual Bandits | | 0 |
| Regularized Contextual Bandits | | 0 |
| Fighting Contextual Bandits with Stochastic Smoothing | | 0 |
| Decentralized Cooperative Stochastic Bandits | Code | 0 |
| Thompson Sampling Algorithms for Cascading Bandits | | 0 |
| Contextual Multi-Armed Bandits for Causal Marketing | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified |
| 2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified |