
Multi-Armed Bandits

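The multi-armed bandit problem is a task in which a fixed amount of resources must be allocated among competing choices (the "arms") so as to maximize expected gain. The payoff of each choice is typically only partially known at the outset and is learned by trying it, so these problems involve an exploration/exploitation trade-off.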
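The exploration/exploitation trade-off can be illustrated with a minimal epsilon-greedy sketch. It is only one simple strategy, not a method taken from any paper listed on this page. The function name `epsilon_greedy`, the Bernoulli reward model, and the example arm means are all assumptions made for illustration.

```python
import random

def epsilon_greedy(true_means, n_rounds=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit with an epsilon-greedy policy.

    true_means: hypothetical success probability of each arm (unknown to the policy).
    Returns (pull counts per arm, estimated mean per arm, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm was pulled
    estimates = [0.0] * n_arms   # running average reward per arm
    total_reward = 0.0

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Simulated Bernoulli reward (assumed reward model).
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running average for the pulled arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, estimates, total_reward

counts, estimates, total_reward = epsilon_greedy([0.2, 0.5, 0.8])
print(counts, [round(e, 2) for e in estimates], total_reward)
```

With a small `epsilon` the policy spends most pulls on the arm it currently believes is best, while still sampling the others often enough to correct a wrong early estimate. Setting `epsilon` to 0 removes exploration entirely, and setting it to 1 ignores what has been learned.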

(Image credit: Microsoft Research)

Papers


Showing 501–510 of 1262 papers

Title	Date	Tasks	Status
Finite-Horizon Single-Pull Restless Bandits: An Efficient Index Policy For Scarce Resource Allocation	Jan 10, 2025	Multi-Armed Bandits	Unverified
Decision Making in Changing Environments: Robustness, Query-Based Learning, and Differential Privacy	Jan 24, 2025	Decision Making, Multi-Armed Bandits	Unverified
Scalable Decision-Focused Learning in Restless Multi-Armed Bandits with Application to Maternal and Child Health	Feb 2, 2022	Multi-Armed Bandits, Scheduling	Unverified
Finite-Time Analysis of Whittle Index based Q-Learning for Restless Multi-Armed Bandits with Neural Network Function Approximation	Oct 3, 2023	Multi-Armed Bandits, Q-Learning	Unverified
Batched Thompson Sampling for Multi-Armed Bandits	Aug 15, 2021	Multi-Armed Bandits, Thompson Sampling	Unverified
First- and Second-Order Bounds for Adversarial Linear Contextual Bandits	May 1, 2023	Multi-Armed Bandits	Unverified
Fixed-Budget Best-Arm Identification in Structured Bandits	Jun 9, 2021	Multi-Armed Bandits	Unverified
FLASH: Federated Learning Across Simultaneous Heterogeneities	Feb 13, 2024	Federated Learning, Multi-Armed Bandits	Unverified
Flexible and Efficient Contextual Bandits with Heterogeneous Treatment Effect Oracles	Mar 30, 2022	Decision Making, Heterogeneous Treatment Effect Estimation	Unverified
Decision Automation for Electric Power Network Recovery	Oct 1, 2019	Decision Making, Multi-Armed Bandits	Unverified


Page 51 of 127

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	NeuralLinear FullPosterior-MR	Cumulative regret	1.92	—	Unverified
2	Linear FullPosterior-MR	Cumulative regret	1.82	—	Unverified