
Multi-Armed Bandits

Multi-armed bandits refer to a class of problems in which a fixed, limited amount of resources must be allocated among competing choices (arms) so as to maximize expected gain, when each choice's properties are only partially known at the time of allocation. These problems typically involve an exploration/exploitation trade-off; a minimal illustration is sketched below.

(Image credit: Microsoft Research)
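For illustration only, here is a minimal sketch of the exploration/exploitation trade-off: an epsilon-greedy policy on a synthetic Bernoulli bandit. The arm probabilities, epsilon value, and function name are assumptions made for this example and are not taken from any paper or benchmark listed below.

```python
# Minimal epsilon-greedy sketch on a synthetic Bernoulli bandit.
# All parameters (arm probabilities, epsilon, step count) are illustrative assumptions.
import random

def run_epsilon_greedy(true_probs, epsilon=0.1, steps=10_000, seed=0):
    """Play a Bernoulli bandit for `steps` rounds; return total reward and mean estimates."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms      # number of pulls per arm
    values = [0.0] * n_arms    # running mean reward per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # explore: pick a uniformly random arm
            arm = rng.randrange(n_arms)
        else:
            # exploit: pick the arm with the highest estimated mean reward
            arm = max(range(n_arms), key=lambda a: values[a])

        reward = 1.0 if rng.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        # incremental update of the running mean for the pulled arm
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    return total_reward, values

if __name__ == "__main__":
    reward, estimates = run_epsilon_greedy([0.2, 0.5, 0.7])
    print(f"total reward: {reward:.0f}, estimated means: {estimates}")
```

With a small epsilon, most pulls go to the arm currently believed best, while the occasional random pull keeps updating estimates for the other arms; this is the trade-off that the cumulative-regret metrics in the benchmark table below quantify.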

Papers

Showing 961–970 of 1262 papers

Title | Status | Hype
On Speeding Up Language Model Evaluation | — | 0
On Submodular Contextual Bandits | — | 0
On the bias, risk and consistency of sample means in multi-armed bandits | — | 0
On the Complexity of Representation Learning in Contextual Linear Bandits | — | 0
On the Identification and Mitigation of Weaknesses in the Knowledge Gradient Policy for Multi-Armed Bandits | — | 0
On the Importance of Uncertainty in Decision-Making with Large Language Models | — | 0
On the Interplay Between Misspecification and Sub-optimality Gap in Linear Contextual Bandits | — | 0
Achieving the Pareto Frontier of Regret Minimization and Best Arm Identification in Multi-Armed Bandits | — | 0
On the Problem of Best Arm Retention | — | 0
Contextual Decision-Making with Knapsacks Beyond the Worst Case | — | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | — | Unverified
2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | — | Unverified