
Multi-Armed Bandits

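The multi-armed bandit problem is a task in which a fixed, limited set of resources must be allocated among competing choices ("arms") so as to maximize expected gain, when the payoff of each choice is only partially known at the time of allocation. These problems typically involve an exploration/exploitation trade-off: trying arms to learn their payoffs versus repeatedly pulling the arm that currently looks best.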
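The exploration/exploitation trade-off can be illustrated with a minimal sketch of UCB1, one standard bandit algorithm, run on a simulated Bernoulli bandit. The function name `ucb1`, the arm means, the horizon, and the seed are all assumptions chosen for illustration; none of them come from the papers or benchmarks listed on this page.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on a simulated Bernoulli bandit.

    arm_means: hypothetical success probabilities, unknown to the learner.
    Returns (pull counts per arm, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of pulls per arm
    sums = [0.0] * n_arms   # total reward observed per arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Pull each arm once so every empirical mean is defined.
            arm = t - 1
        else:
            # Empirical mean (exploitation) plus a confidence bonus
            # that shrinks as an arm is pulled more often (exploration).
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return counts, total


# Illustrative run: three arms with made-up means; the best arm is index 2.
counts, total = ucb1([0.2, 0.5, 0.8], horizon=5000)
# Cumulative regret relative to always pulling the best arm.
regret = 0.8 * 5000 - total
```

In this sketch the bonus term forces occasional pulls of under-sampled arms, while the empirical mean steers most pulls toward the arm that looks best, so `counts` should end up concentrated on the highest-mean arm.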


Papers


Showing 671–680 of 1262 papers

Title	Date	Tasks	Status
Deep Upper Confidence Bound Algorithm for Contextual Bandit Ranking of Information Selection	Oct 8, 2021	Multi-Armed Bandits	Unverified
A Model Selection Approach for Corruption Robust Reinforcement Learning	Oct 7, 2021	Model Selection, Multi-Armed Bandits	Unverified
Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning	Oct 2, 2021	Multi-Armed Bandits, Regression	Unverified
Asymptotic Performance of Thompson Sampling in the Batched Multi-Armed Bandits	Oct 1, 2021	Multi-Armed Bandits, Thompson Sampling	Unverified
Batched Thompson Sampling	Oct 1, 2021	Multi-Armed Bandits, Thompson Sampling	Unverified
Adapting Bandit Algorithms for Settings with Sequentially Available Arms	Sep 30, 2021	Management, Multi-Armed Bandits	Unverified
Regularized-OFU: an efficient algorithm for general contextual bandit with optimization oracles	Sep 29, 2021	Multi-Armed Bandits, Thompson Sampling	Unverified
Causal Contextual Bandits with Targeted Interventions	Sep 29, 2021	Multi-Armed Bandits	Unverified
Expected Improvement-based Contextual Bandits	Sep 29, 2021	Bayesian Optimization, Multi-Armed Bandits	Unverified
Batched Bandits with Crowd Externalities	Sep 29, 2021	Multi-Armed Bandits	Unverified



Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	NeuralLinear FullPosterior-MR	Cumulative regret	1.92	—	Unverified
2	Linear FullPosterior-MR	Cumulative regret	1.82	—	Unverified