SOTAVerified

Multi-Armed Bandits

Multi-armed bandits refer to the problem of allocating a fixed amount of resources among competing choices (arms) so as to maximize expected gain, when each arm's payoff is only partially known at the time of allocation. These problems typically involve an exploration/exploitation trade-off.

(Image credit: Microsoft Research)
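The exploration/exploitation trade-off described above can be illustrated with a minimal epsilon-greedy agent on a Bernoulli bandit. This is a generic sketch, not taken from any paper listed below; the arm means and parameter values are illustrative assumptions.

```python
import random

def epsilon_greedy_bandit(true_means, n_rounds=10000, epsilon=0.1, seed=0):
    """Minimal epsilon-greedy agent for a Bernoulli multi-armed bandit.

    true_means: per-arm success probabilities (unknown to the agent).
    Returns total reward collected and the pull count per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms      # pulls per arm
    values = [0.0] * n_arms    # running mean reward estimate per arm
    total_reward = 0.0
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                       # explore: random arm
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit: best estimate
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]   # incremental mean update
        total_reward += reward
    return total_reward, counts
```

With a small epsilon the agent spends most rounds on the empirically best arm while still sampling the others often enough to correct early misestimates.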

Papers

Showing 1201–1210 of 1262 papers

Title                                                                                   Status  Hype
Policy Learning with Adaptively Collected Data                                          Code    0
Neural Contextual Bandits without Regret                                                Code    0
Meta-in-context learning in large language models                                       Code    0
Neural Contextual Bandits with UCB-based Exploration                                    Code    0
Adaptive Experimentation with Delayed Binary Feedback                                   Code    0
Group Meritocratic Fairness in Linear Contextual Bandits                                Code    0
Neural Linear Bandits: Overcoming Catastrophic Forgetting through Likelihood Matching   Code    0
Power Constrained Bandits                                                               Code    0
Batched Multi-armed Bandits Problem                                                     Code    0
Harnessing the Power of Federated Learning in Federated Contextual Bandits              Code    0
Page 121 of 127

Benchmark Results

#  Model                          Metric             Claimed  Verified  Status
1  NeuralLinear FullPosterior-MR  Cumulative regret  1.92               Unverified
2  Linear FullPosterior-MR        Cumulative regret  1.82               Unverified
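Cumulative regret, the metric reported above, measures how much expected reward an algorithm gives up relative to always playing the best arm. A minimal sketch, assuming the true arm means are known to the evaluator (the function name and inputs are illustrative, not from the benchmark):

```python
def cumulative_regret(true_means, arms_played):
    """Cumulative pseudo-regret: sum over rounds of (best arm's mean
    minus the mean of the arm actually played).

    true_means:  per-arm expected rewards, known only to the evaluator.
    arms_played: sequence of arm indices chosen by the algorithm.
    """
    best = max(true_means)
    return sum(best - true_means[arm] for arm in arms_played)
```

Lower is better: an algorithm that quickly identifies and commits to the best arm accrues regret that grows sublinearly in the number of rounds.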