
Multi-Armed Bandits

The multi-armed bandit problem is a task in which a fixed, limited set of resources must be allocated among competing choices so as to maximize expected gain, when each choice's properties are only partially known at the time of allocation. These problems typically involve an exploration/exploitation trade-off.

(Image credit: Microsoft Research)
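The exploration/exploitation trade-off is often illustrated with the classic UCB1 algorithm, which pulls the arm whose empirical mean plus a shrinking confidence bonus is highest. Below is a minimal sketch, not taken from any paper on this page; the Bernoulli arms and their success probabilities are hypothetical.

```python
import math
import random

def ucb1(reward_fns, horizon=10_000):
    """Minimal UCB1: play each arm once, then repeatedly pull the arm
    with the highest upper confidence bound (mean + exploration bonus)."""
    n_arms = len(reward_fns)
    counts = [0] * n_arms      # pulls per arm
    means = [0.0] * n_arms     # running mean reward per arm
    total_reward = 0.0
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1        # initial round-robin: try every arm once
        else:
            # exploration bonus shrinks as an arm accumulates pulls
            arm = max(range(n_arms),
                      key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]))
        r = reward_fns[arm]()
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]  # incremental mean update
        total_reward += r
    return total_reward, counts

# Hypothetical Bernoulli arms with unknown success probabilities.
arms = [lambda p=p: float(random.random() < p) for p in (0.3, 0.5, 0.7)]
reward, pulls = ucb1(arms)
print(f"total reward {reward:.0f}, pulls per arm {pulls}")
```

After the forced first pass over all arms, the bonus term guarantees every arm is revisited occasionally, while pull counts concentrate on the empirically best arm.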

Papers

Showing 291–300 of 1262 papers

Title | Status | Hype
Efficient Contextual Bandits with Uninformed Feedback Graphs | - | 0
Stochastic contextual bandits with graph feedback: from independence number to MAS number | - | 0
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning | - | 0
Fast UCB-type algorithms for stochastic bandits with heavy and super heavy symmetric noise | - | 0
Tree Ensembles for Contextual Bandits | - | 0
Fairness of Exposure in Online Restless Multi-armed Bandits | Code | 0
Simultaneously Achieving Group Exposure Fairness and Within-Group Meritocracy in Stochastic Bandits | Code | 0
Context in Public Health for Underserved Communities: A Bayesian Approach to Online Restless Bandits | - | 0
Fairness and Privacy Guarantees in Federated Contextual Bandits | - | 0
Off-Policy Evaluation of Slate Bandit Policies via Optimizing Abstraction | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | - | Unverified
2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | - | Unverified
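The metric reported above is cumulative regret: the gap between the expected reward of always pulling the best arm and the expected reward of the arms a policy actually pulled. A minimal sketch of the computation, with hypothetical arm means and pull sequence (none of these numbers come from the benchmark):

```python
# Cumulative regret = sum over steps of (best arm's expected reward
# minus the expected reward of the arm the policy chose at that step).
arm_means = [0.3, 0.5, 0.7]   # hypothetical expected rewards per arm
choices = [0, 2, 1, 2, 2, 2]  # hypothetical sequence of arms pulled
best = max(arm_means)
regret = sum(best - arm_means[a] for a in choices)
print(f"cumulative regret after {len(choices)} steps: {regret:.2f}")
```

Lower is better: a policy that identifies the best arm quickly accumulates regret sublinearly in the horizon.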