SOTAVerified

Multi-Armed Bandits

The multi-armed bandit problem is a task in which a fixed amount of resources must be allocated among competing choices so as to maximize expected gain. These problems typically involve an exploration/exploitation trade-off.
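One classic way to handle the exploration/exploitation trade-off is an epsilon-greedy policy: with probability epsilon the learner explores a random arm, otherwise it exploits the arm with the best reward estimate so far. Below is a minimal sketch assuming Bernoulli-reward arms; the arm means and all parameter values are illustrative, not taken from any paper listed here.

```python
import random

def epsilon_greedy(true_means, steps=10000, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit sketch over Bernoulli arms (hypothetical setup)."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k        # number of pulls per arm
    estimates = [0.0] * k   # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(k)                          # explore: random arm
        else:
            arm = max(range(k), key=lambda a: estimates[a])  # exploit: best estimate
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # incremental mean update
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, total_reward

estimates, total_reward = epsilon_greedy([0.2, 0.5, 0.8])
```

With enough steps the estimate for the best arm approaches its true mean, since exploitation concentrates pulls on it while occasional exploration keeps the other estimates from going stale.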

(Image credit: Microsoft Research)

Papers

Showing 391–400 of 1262 papers

Title | Hype
Be Greedy in Multi-Armed Bandits | 0
Differentially Private Episodic Reinforcement Learning with Heavy-tailed Rewards | 0
Meta-Learning Bandit Policies by Gradient Ascent | 0
Doubly robust off-policy evaluation with shrinkage | 0
Beam Learning -- Using Machine Learning for Finding Beam Directions | 0
Doubly Robust Policy Evaluation and Optimization | 0
A Near-Optimal Change-Detection Based Algorithm for Piecewise-Stationary Combinatorial Semi-Bandits | 0
Designing Truthful Contextual Multi-Armed Bandits based Sponsored Search Auctions | 0
Designing an Interpretable Interface for Contextual Bandits | 0
BEACON: Balancing Convenience and Nutrition in Meals With Long-Term Group Recommendations and Reasoning on Multimodal Recipes | 0
Page 40 of 127

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | — | Unverified
2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | — | Unverified
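The benchmark metric above is cumulative regret: the gap, summed over rounds, between the reward of always playing the best arm and the reward of the arms actually pulled. A minimal sketch, assuming known true arm means (the values below are illustrative and unrelated to the benchmark entries):

```python
def cumulative_regret(true_means, arms_pulled):
    """Cumulative (pseudo-)regret of a sequence of arm pulls,
    given the true mean reward of each arm."""
    best = max(true_means)
    return sum(best - true_means[arm] for arm in arms_pulled)

# Best arm has mean 0.8; the sequence pulls arms 0, 2, 1, 2 in turn:
regret = cumulative_regret([0.2, 0.5, 0.8], [0, 2, 1, 2])
# (0.8 - 0.2) + 0 + (0.8 - 0.5) + 0  ≈ 0.9
```

A policy with sublinear cumulative regret in the number of rounds is one that eventually identifies and plays the best arm almost always.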