
Multi-Armed Bandits

Multi-armed bandits refer to a class of problems in which a fixed, limited amount of resources must be allocated among competing alternatives so as to maximize expected gain. These problems typically involve an exploration/exploitation trade-off: the learner must balance trying arms whose payoffs are uncertain against repeatedly playing the arm that currently looks best.
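The exploration/exploitation trade-off can be illustrated with a minimal epsilon-greedy sketch (an assumption for illustration, not a method from any listed paper): with probability epsilon the learner explores a random arm, otherwise it exploits the arm with the highest estimated mean reward. The Bernoulli arm means below are made up for the example.

```python
import random

def run_epsilon_greedy(true_means, steps=10000, epsilon=0.1, seed=0):
    """Epsilon-greedy on Bernoulli arms: explore a random arm with
    probability epsilon, otherwise play the arm with the best estimate."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: uniform random arm
        else:
            # exploit: arm with the highest current estimate
            arm = max(range(n_arms), key=lambda a: estimates[a])
        # Bernoulli reward drawn from the arm's (hypothetical) true mean
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # incremental update of the running mean
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, total_reward

estimates, total = run_epsilon_greedy([0.2, 0.5, 0.8])
best_arm = max(range(3), key=lambda a: estimates[a])
```

After enough pulls the estimate for the best arm (mean 0.8) dominates, so most plays concentrate there while the epsilon fraction keeps sampling the others.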

(Image credit: Microsoft Research)

Papers

Showing 761-770 of 1262 papers

Title | Status | Hype
Learning Multiple Tasks in Parallel with a Shared Annotator | | 0
Learning Personalized Decision Support Policies | | 0
Learning to Actively Learn: A Robust Approach | | 0
Learning to Coordinate with Coordination Graphs in Repeated Single-Stage Multi-Agent Decision Problems | | 0
Learning to Explore with Lagrangians for Bandits under Unknown Linear Constraints | | 0
Learning to Optimize Energy Efficiency in Energy Harvesting Wireless Sensor Networks | | 0
Learning to Rank in the Position Based Model with Bandit Feedback | | 0
Learning to Search Better Than Your Teacher | | 0
Learning to Use Learners' Advice | | 0
Lenient Regret for Multi-Armed Bandits | | 0
Page 77 of 127

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified
2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified