SOTAVerified

Multi-Armed Bandits

Multi-armed bandits refer to a task in which a fixed, limited amount of resources must be allocated among competing choices so as to maximize expected gain, when each choice's reward distribution is only partially known. These problems typically involve an exploration/exploitation trade-off.

(Image credit: Microsoft Research)
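The exploration/exploitation trade-off can be made concrete with a small sketch. Below is a minimal, hypothetical epsilon-greedy agent on a Bernoulli bandit; the function name, arm probabilities, and parameter values are illustrative assumptions, not taken from any paper or benchmark listed on this page.

```python
import random

def epsilon_greedy_bandit(true_probs, n_rounds=1000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit with an epsilon-greedy policy.

    true_probs: hypothetical per-arm reward probabilities (unknown to the agent).
    Returns the total reward collected over n_rounds.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms      # number of pulls per arm
    means = [0.0] * n_arms     # empirical mean reward per arm
    total_reward = 0.0

    for _ in range(n_rounds):
        # Explore a random arm with probability epsilon, otherwise exploit
        # the arm with the highest empirical mean.
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=lambda a: means[a])

        # Sample a Bernoulli reward from the chosen arm.
        reward = 1.0 if rng.random() < true_probs[arm] else 0.0
        total_reward += reward

        # Incrementally update that arm's empirical mean.
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]

    return total_reward

if __name__ == "__main__":
    print(epsilon_greedy_bandit([0.2, 0.5, 0.7]))
```

With a small epsilon the agent spends most rounds on the empirically best arm while still occasionally sampling the others, which is the basic trade-off the papers below study in more refined forms.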

Papers

Showing 661–670 of 1262 papers

Title | Status | Hype
The Pareto Frontier of model selection for general Contextual Bandits | — | 0
Linear Contextual Bandits with Adversarial Corruptions | — | 0
Towards the D-Optimal Online Experiment Design for Recommender Selection | Code | 0
Analysis of Thompson Sampling for Partially Observable Contextual Multi-Armed Bandits | — | 0
Dynamic pricing and assortment under a contextual MNL demand | — | 0
Stateful Offline Contextual Policy Evaluation and Learning | — | 0
Achieving the Pareto Frontier of Regret Minimization and Best Arm Identification in Multi-Armed Bandits | — | 0
Almost Optimal Batch-Regret Tradeoff for Batch Linear Contextual Bandits | — | 0
Bandits Don't Follow Rules: Balancing Multi-Facet Machine Translation with Multi-Armed Bandits | — | 0
Query-Reward Tradeoffs in Multi-Armed Bandits | — | 0
Page 67 of 127

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | — | Unverified
2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | — | Unverified
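The benchmark metric above is cumulative regret. As a rough sketch, assuming the usual pseudo-regret definition (the summed gap between the optimal arm's expected reward and the expected reward of each arm actually pulled), it could be computed as below; the helper name and example numbers are hypothetical and not drawn from the benchmark entries.

```python
def cumulative_regret(chosen_expected_rewards, best_expected_reward):
    """Cumulative pseudo-regret over a sequence of pulls.

    chosen_expected_rewards: expected reward of the arm pulled at each round
        (a hypothetical trace, known only in simulation).
    best_expected_reward: expected reward of the optimal arm.
    """
    return sum(best_expected_reward - r for r in chosen_expected_rewards)

# Example: the optimal arm pays 0.7 in expectation; the agent pulled
# suboptimal arms in two of the four rounds, giving regret 0.2 + 0.5 = 0.7.
print(cumulative_regret([0.7, 0.5, 0.7, 0.2], best_expected_reward=0.7))
```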