
Multi-Armed Bandits

Multi-armed bandits refer to a class of problems in which a fixed amount of resources must be allocated among competing choices (arms) so as to maximize expected gain. These problems typically involve an exploration/exploitation trade-off.

(Image credit: Microsoft Research)
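As a concrete illustration of the exploration/exploitation trade-off mentioned above, here is a minimal epsilon-greedy sketch in Python. The Bernoulli arm probabilities, epsilon value, and horizon are illustrative assumptions, not taken from any paper or benchmark result listed on this page.

```python
import random

def epsilon_greedy(true_probs, n_rounds=10_000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit with epsilon-greedy arm selection.

    true_probs: assumed per-arm reward probabilities (illustrative only).
    Returns the estimated arm values and the empirical cumulative regret.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms    # number of pulls per arm
    values = [0.0] * n_arms  # running mean reward per arm

    total_reward = 0.0
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                       # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1.0 if rng.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]    # incremental mean
        total_reward += reward

    # Empirical regret: reward of always playing the best arm minus what we got.
    regret = max(true_probs) * n_rounds - total_reward
    return values, regret

if __name__ == "__main__":
    estimates, regret = epsilon_greedy([0.2, 0.5, 0.7])
    print("estimated arm values:", [round(v, 3) for v in estimates])
    print("cumulative regret:", round(regret, 1))
```

Cumulative regret, as reported in the benchmark table below, measures the gap between the reward an oracle playing the best arm would have collected and the reward the algorithm actually collected.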

Papers

Showing 261-270 of 1262 papers

Title | Status | Hype
A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning | | 0
A Contextual Combinatorial Bandit Approach to Negotiation | | 0
A Blackbox Approach to Best of Both Worlds in Bandits and Beyond | | 0
Contextual Bandits for Evaluating and Improving Inventory Control Policies | | 0
Contextual Bandits and Optimistically Universal Learning | | 0
Augmenting Online RL with Offline Data is All You Need: A Unified Hybrid RL Algorithm Design and Analysis | | 0
Contextual Bandits and Imitation Learning via Preference-Based Active Queries | | 0
Contextual Bandit Applications in Customer Support Bot | | 0
Asymptotic Randomised Control with applications to bandits | | 0
A Federated Online Restless Bandit Framework for Cooperative Resource Allocation | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified
2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified