Multi-Armed Bandits

Multi-armed bandits refer to the task of allocating a fixed amount of resources among competing alternatives so as to maximize expected gain, when each alternative's reward properties are only partially known and are learned through repeated play. These problems typically involve an exploration/exploitation trade-off.
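
To make the exploration/exploitation trade-off concrete, here is a minimal epsilon-greedy sketch in Python. The Bernoulli arm probabilities, the epsilon value, and the horizon are illustrative assumptions, not taken from any paper listed below.

```python
import random

def epsilon_greedy(arm_probs, epsilon=0.1, horizon=1000):
    """Minimal epsilon-greedy bandit on Bernoulli arms (illustrative sketch)."""
    n_arms = len(arm_probs)
    counts = [0] * n_arms    # number of pulls per arm
    values = [0.0] * n_arms  # running mean reward per arm
    total_reward = 0.0
    for _ in range(horizon):
        if random.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = random.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated mean.
            arm = max(range(n_arms), key=lambda a: values[a])
        reward = 1.0 if random.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean reward.
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward
    return total_reward, values

if __name__ == "__main__":
    reward, estimates = epsilon_greedy([0.2, 0.5, 0.7])
    print(f"total reward: {reward:.0f}, estimated means: {estimates}")
```

With a small epsilon, the agent mostly exploits its current best estimate but keeps sampling other arms often enough to correct early misjudgments; many of the contextual-bandit papers below refine exactly this trade-off.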

Papers

Showing 1241-1250 of 1262 papers

Title | Status | Hype
Contextual Bandits for Advertising Campaigns: A Diffusion-Model Independent Approach (Extended Version) | - | 0
Contextual Bandits for Evaluating and Improving Inventory Control Policies | - | 0
Contextual Bandits for Unbounded Context Distributions | - | 0
Contextual Bandits in a Survey Experiment on Charitable Giving: Within-Experiment Outcomes versus Policy Learning | - | 0
Contextual Bandits in Payment Processing: Non-uniform Exploration and Supervised Learning at Adyen | - | 0
Linear Bandits with Stochastic Delayed Feedback | - | 0
Contextual Bandits with Arm Request Costs and Delays | - | 0
Contextual Bandits with Budgeted Information Reveal | - | 0
Contextual bandits with concave rewards, and an application to fair ranking | - | 0
Contextual Bandits with Continuous Actions: Smoothing, Zooming, and Adapting | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | - | Unverified
2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | - | Unverified
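
The "Cumulative regret" metric above is presumably the standard notion: the total gap between the expected reward of always playing the best arm and the expected reward of the arms actually chosen. A minimal sketch, assuming Bernoulli arms whose true means are known for the purpose of illustration:

```python
def cumulative_regret(arm_means, chosen_arms):
    """Standard cumulative (pseudo-)regret: sum of per-step gaps to the best arm."""
    best = max(arm_means)
    return sum(best - arm_means[a] for a in chosen_arms)

# Example: best arm has mean 0.7; playing arms 0, 1, 2 once each
# incurs gaps of 0.5, 0.0, and 0.2 respectively.
print(f"{cumulative_regret([0.2, 0.7, 0.5], [0, 1, 2]):.2f}")  # 0.70
```

Lower values are better, and a policy that learns the best arm quickly accumulates regret that grows slowly with the horizon.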