SOTAVerified

Multi-Armed Bandits

The multi-armed bandit problem is a task where a fixed amount of resources must be allocated among competing choices so as to maximize expected gain, when each choice's payoff is only partially known at allocation time. These problems typically involve an exploration/exploitation trade-off: spending pulls to learn about uncertain arms versus pulling the arm currently believed best.
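As a minimal sketch of the exploration/exploitation trade-off described above, the epsilon-greedy policy below explores a random arm with probability epsilon and otherwise exploits the arm with the highest estimated mean. The arm payoff probabilities and all parameter values are illustrative assumptions, not taken from any of the listed papers.

```python
import random

def epsilon_greedy_bandit(true_means, epsilon=0.1, steps=1000, seed=0):
    """Epsilon-greedy on a Bernoulli bandit (illustrative sketch).

    true_means: hidden success probability of each arm (assumed for the demo).
    Returns the per-arm value estimates and the total reward collected.
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k          # pulls per arm
    estimates = [0.0] * k     # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if rng.random() < epsilon:
            arm = rng.randrange(k)
        else:
            arm = max(range(k), key=lambda a: estimates[a])
        # Draw a Bernoulli reward from the chosen arm's hidden probability.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental mean update avoids storing the full reward history.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, total_reward

estimates, total_reward = epsilon_greedy_bandit([0.2, 0.5, 0.8])
```

With a small epsilon the policy keeps exploring forever at a fixed rate, so its regret grows linearly; the contextual and index-based policies in the papers listed below refine this basic template.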

(Image credit: Microsoft Research)

Papers

Showing 1121–1130 of 1262 papers

| Title | Status | Hype |
| --- | --- | --- |
| Fairness of Exposure in Online Restless Multi-armed Bandits | Code | 0 |
| Learning Contextual Bandits in a Non-stationary Environment | Code | 0 |
| Falcon: Fair Active Learning using Multi-armed Bandits | Code | 0 |
| Optimistic Whittle Index Policy: Online Learning for Restless Bandits | Code | 0 |
| Quantum exploration algorithms for multi-armed bandits | Code | 0 |
| Contextual bandits with entropy-based human feedback | Code | 0 |
| Fast Beam Alignment via Pure Exploration in Multi-armed Bandits | Code | 0 |
| Contextual Bandits with Large Action Spaces: Made Practical | Code | 0 |
| VacSIM: Learning Effective Strategies for COVID-19 Vaccine Distribution using Reinforcement Learning | Code | 0 |
| Transfer Learning in Latent Contextual Bandits with Covariate Shift Through Causal Transportability | Code | 0 |
Page 113 of 127

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | — | Unverified |
| 2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | — | Unverified |