Multi-Armed Bandits

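Multi-armed bandits refer to a task in which a fixed amount of resources must be allocated among competing choices so as to maximize expected gain. These problems typically involve an exploration/exploitation trade-off: the learner must balance trying options whose payoffs are still uncertain against choosing the option that currently looks best.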
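To make the exploration/exploitation trade-off concrete, here is a minimal sketch of an epsilon-greedy learner on a simulated Bernoulli bandit. It is one simple strategy among many, not a method taken from any paper listed on this page. The function name `run_epsilon_greedy`, its parameters, and the arm means in the usage note are illustrative assumptions.

```python
import random


def run_epsilon_greedy(true_means, horizon=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on a Bernoulli bandit (illustrative sketch).

    true_means: hypothetical success probability of each arm (unknown to the learner).
    Returns (counts, estimates, regret), where regret is the cumulative
    expected regret relative to always pulling the best arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean of observed rewards per arm
    best_mean = max(true_means)
    regret = 0.0

    for _ in range(horizon):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated mean
            # (ties go to the lowest index).
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Draw a 0/1 reward from the chosen arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running mean for that arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

        # Accumulate the expected loss of this pull versus the best arm.
        regret += best_mean - true_means[arm]

    return counts, estimates, regret


if __name__ == "__main__":
    counts, estimates, regret = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls per arm:", counts)
    print("estimated means:", [round(e, 3) for e in estimates])
    print("cumulative regret:", round(regret, 1))
```

Raising `epsilon` spends more pulls on exploration, which improves the estimates for every arm but adds regret on each random pull. Lowering it exploits sooner at the risk of locking onto a suboptimal arm.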

Papers

Showing 721–730 of 1262 papers

Title	Date	Tasks	Status
On the Identification and Mitigation of Weaknesses in the Knowledge Gradient Policy for Multi-Armed Bandits	Jul 20, 2016	Decision Making, Multi-Armed Bandits	Unverified
On the Importance of Uncertainty in Decision-Making with Large Language Models	Apr 3, 2024	Decision Making, Multi-Armed Bandits	Unverified
On the Interplay Between Misspecification and Sub-optimality Gap in Linear Contextual Bandits	Mar 16, 2023	Multi-Armed Bandits	Unverified
Achieving the Pareto Frontier of Regret Minimization and Best Arm Identification in Multi-Armed Bandits	Oct 16, 2021	Multi-Armed Bandits	Unverified
On the Problem of Best Arm Retention	Apr 16, 2025	Multi-Armed Bandits	Unverified
Contextual Decision-Making with Knapsacks Beyond the Worst Case	Nov 25, 2022	Decision Making, Management	Unverified
On The Statistical Complexity of Offline Decision-Making	Jan 10, 2025	Decision Making, Multi-Armed Bandits	Unverified
On Top-k Selection in Multi-Armed Bandits and Hidden Bipartite Graphs	Dec 1, 2015	Multi-Armed Bandits	Unverified
On Universally Optimal Algorithms for A/B Testing	Aug 23, 2023	Multi-Armed Bandits	Unverified
Open Problem: Best Arm Identification: Almost Instance-Wise Optimality and the Gap Entropy Conjecture	May 27, 2016	Multi-Armed Bandits	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	NeuralLinear FullPosterior-MR	Cumulative regret	1.92	—	Unverified
2	Linear FullPosterior-MR	Cumulative regret	1.82	—	Unverified