Multi-Armed Bandits

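The multi-armed bandit problem is a task in which a fixed amount of resources must be allocated among competing choices (arms) so as to maximize expected gain. These problems typically involve an exploration/exploitation trade-off between trying arms whose payoff is still uncertain and repeatedly choosing the arm that currently looks best.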
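
The exploration/exploitation trade-off can be illustrated with epsilon-greedy, one simple bandit strategy. The sketch below is a minimal illustration and is not taken from any paper listed on this page; the function name `run_epsilon_greedy`, the Bernoulli reward model, and the default parameters are all assumptions chosen for the example.

```python
import random


def run_epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms (hypothetical helper).

    true_means: success probability of each arm (unknown to the learner).
    Returns (estimates, pulls, cumulative_regret).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    pulls = [0] * n_arms          # number of times each arm was played
    estimates = [0.0] * n_arms    # running mean reward per arm
    best_mean = max(true_means)
    regret = 0.0                  # expected reward lost vs. always playing the best arm

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Draw a Bernoulli reward from the chosen arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running mean for that arm.
        pulls[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]

        # Accumulate (pseudo-)regret using the true means.
        regret += best_mean - true_means[arm]

    return estimates, pulls, regret


if __name__ == "__main__":
    # Assumed arm means, for illustration only.
    estimates, pulls, regret = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("estimated means:", [round(e, 3) for e in estimates])
    print("pulls per arm:", pulls)
    print("cumulative regret:", round(regret, 2))
```

A larger `epsilon` spends more pulls on exploration and so accumulates regret steadily, while `epsilon=0` never explores deliberately and can lock onto a suboptimal arm. The regret computed here uses the true arm means, which are only available in simulation.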

(Image credit: Microsoft Research)

Papers

Showing 121–130 of 1262 papers

Title	Date	Tasks	Status	Score
Causal Contextual Bandits with Adaptive Context	May 28, 2024	Multi-Armed Bandits	Code Available	5
Kullback-Leibler Maillard Sampling for Multi-armed Bandits with Bounded Rewards	Apr 28, 2023	Multi-Armed Bandits, Thompson Sampling	Code Available	5
A Survey of Online Experiment Design with the Stochastic Multi-Armed Bandit	Oct 2, 2015	Decision Making, Multi-Armed Bandits	Code Available	5
Latent Bottlenecked Attentive Neural Processes	Nov 15, 2022	Meta-Learning, Multi-Armed Bandits	Code Available	5
A Survey on Contextual Multi-armed Bandits	Aug 13, 2015	Multi-Armed Bandits, Survey	Code Available	5
Learning Structural Weight Uncertainty for Sequential Decision-Making	Dec 30, 2017	Decision Making, Multi-Armed Bandits	Code Available	5
Cascading Bandits for Large-Scale Recommendation Problems	Mar 17, 2016	Multi-Armed Bandits, Recommendation Systems	Code Available	5
Causally Abstracted Multi-armed Bandits	Apr 26, 2024	Decision Making, Multi-Armed Bandits	Code Available	5
Adaptive Action Duration with Contextual Bandits for Deep Reinforcement Learning in Dynamic Environments	Jun 17, 2025	Atari Games, Board Games	Code Available	5
A Field Test of Bandit Algorithms for Recommendations: Understanding the Validity of Assumptions on Human Preferences in Multi-armed Bandits	Apr 16, 2023	Multi-Armed Bandits, Recommendation Systems	Code Available	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	NeuralLinear FullPosterior-MR	Cumulative regret	1.92	—	Unverified
2	Linear FullPosterior-MR	Cumulative regret	1.82	—	Unverified