SOTAVerified

Multi-Armed Bandits

The multi-armed bandit problem is a task in which a fixed amount of resources must be allocated among competing alternatives so as to maximize expected gain. These problems typically involve an exploration/exploitation trade-off: the learner must balance trying arms whose payoffs are still uncertain against repeatedly playing the arm that currently looks best.
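The trade-off above can be sketched with a minimal epsilon-greedy policy, one of the simplest bandit strategies (this is an illustrative example, not any specific paper's method; the Bernoulli arm means and the `epsilon_greedy_bandit` helper are assumptions for the sketch):

```python
import random

def epsilon_greedy_bandit(arm_means, n_rounds=10_000, epsilon=0.1, seed=0):
    """Run an epsilon-greedy policy against Bernoulli arms with the given means.

    With probability epsilon the policy explores (random arm); otherwise it
    exploits the arm with the highest estimated mean reward so far.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms    # pulls per arm
    values = [0.0] * n_arms  # running mean reward per arm
    total_reward = 0.0
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                        # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]    # incremental mean
        total_reward += reward
    return counts, total_reward

counts, total = epsilon_greedy_bandit([0.2, 0.5, 0.8])
```

After enough rounds the policy concentrates its pulls on the best arm (mean 0.8) while still spending roughly an epsilon fraction of rounds exploring.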

(Image credit: Microsoft Research)

Papers

Showing 971–980 of 1262 papers

Title | Status | Hype
Tight Regret Bounds for Infinite-armed Linear Contextual Bandits | | 0
Top-K Ranking Deep Contextual Bandits for Information Selection Systems | | 0
To update or not to update? Delayed Nonparametric Bandits with Randomized Allocation | | 0
Towards Distribution-Free Multi-Armed Bandits with Combinatorial Strategies | | 0
Towards Domain Adaptive Neural Contextual Bandits | | 0
Towards More Efficient, Robust, Instance-adaptive, and Generalizable Sequential Decision making | | 0
Towards Optimal Algorithms for Multi-Player Bandits without Collision Sensing Information | | 0
Towards Robust Off-Policy Evaluation via Human Inputs | | 0
Towards Soft Fairness in Restless Multi-Armed Bandits | | 0
Towards Understanding the Benefit of Multitask Representation Learning in Decision Process | | 0
Page 98 of 127

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified
2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified
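The benchmark metric above, cumulative regret, is the total shortfall of the played arms against the best fixed arm in hindsight. A minimal sketch of the standard pseudo-regret computation (the function name and example values are assumptions, not taken from the benchmark):

```python
def cumulative_regret(chosen_means, best_mean):
    """Cumulative pseudo-regret: sum over rounds of (best arm's mean
    reward minus the mean reward of the arm actually played)."""
    return sum(best_mean - m for m in chosen_means)

# Three rounds where arms with means 0.5, 0.8, 0.8 were played;
# the best arm has mean 0.8, so only the first round incurs regret.
regret = cumulative_regret([0.5, 0.8, 0.8], 0.8)
```

Lower cumulative regret is better: a policy that quickly identifies and sticks with the best arm accumulates regret sublinearly in the number of rounds.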