SOTAVerified

Multi-Armed Bandits

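Multi-armed bandits refer to a task in which a fixed amount of resources must be allocated among competing choices (the "arms") so as to maximize expected gain. The payoff of each arm is only partially known in advance, so these problems typically involve an exploration/exploitation trade-off: trying arms to learn their payoffs versus repeatedly playing the arm that currently looks best.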
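As an illustration of the exploration/exploitation trade-off, the following is a minimal sketch of an epsilon-greedy strategy on a simulated Bernoulli bandit. It is not taken from any paper listed on this page. The function name `run_epsilon_greedy`, its parameters, and the arm means used in the usage note are all hypothetical choices made for the example.

```python
import random


def run_epsilon_greedy(true_means, n_rounds=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on a Bernoulli bandit (illustrative sketch).

    true_means: success probability of each arm (unknown to the learner).
    Returns (pull counts per arm, cumulative expected regret).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean of observed rewards per arm
    best_mean = max(true_means)
    regret = 0.0

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated mean.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Draw a 0/1 reward from the chosen arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running mean for that arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

        # Expected regret of this pull relative to the best arm.
        regret += best_mean - true_means[arm]

    return counts, regret
```

Calling `run_epsilon_greedy([0.2, 0.5, 0.8])` returns the number of pulls each arm received and the cumulative expected regret under these assumptions. A larger `epsilon` spends more pulls exploring, while `epsilon=0` never explores and can lock onto a suboptimal arm.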


Papers

Showing 1076–1100 of 1262 papers

| Title | Status | Hype |
| --- | --- | --- |
| Contextual Bandits with Cross-learning | | 0 |
| SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits | Code | 0 |
| Multi-Player Bandits: A Trekking Approach | | 0 |
| Machine Teaching of Active Sequential Learners | Code | 0 |
| Diversity-Driven Selection of Exploration Strategies in Multi-Armed Bandits | | 0 |
| Data Poisoning Attacks in Contextual Bandits | | 0 |
| Correlated Multi-armed Bandits with a Latent Random Source | Code | 0 |
| Nonparametric Gaussian Mixture Models for the Multi-Armed Bandit | Code | 0 |
| On-line Adaptative Curriculum Learning for GANs | Code | 0 |
| Preference-based Online Learning with Dueling Bandits: A Survey | | 0 |
| Deep Contextual Multi-armed Bandits | | 0 |
| Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits | | 0 |
| Linear Bandits with Stochastic Delayed Feedback | | 0 |
| Multi-User Multi-Armed Bandits for Uncoordinated Spectrum Access | | 0 |
| Learning to Coordinate with Coordination Graphs in Repeated Single-Stage Multi-Agent Decision Problems | | 0 |
| Contextual bandits with surrogate losses: Margin bounds and efficient algorithms | | 0 |
| Greybox fuzzing as a contextual bandits problem | | 0 |
| Mitigating Bias in Adaptive Data Gathering via Differential Privacy | | 0 |
| Finding the bandit in a graph: Sequential search-and-stop | | 0 |
| A General Framework for Bandit Problems Beyond Cumulative Objectives | | 0 |
| The Externalities of Exploration and How Data Diversity Helps Exploitation | | 0 |
| Myopic Bayesian Design of Experiments via Posterior Sampling and Probabilistic Programming | Code | 0 |
| Learning Contextual Bandits in a Non-stationary Environment | Code | 0 |
| Multi-Statistic Approximate Bayesian Computation with Multi-Armed Bandits | | 0 |
| Bandit-Based Monte Carlo Optimization for Nearest Neighbors | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified |
| 2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified |