SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
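As a rough illustration of the idea described above, the following sketch applies Thompson sampling to a Bernoulli bandit. It assumes binary rewards and a uniform Beta(1, 1) prior on each arm, neither of which is specified on this page; the function name `thompson_sampling` and its parameters `true_probs`, `n_rounds`, and `seed` are hypothetical names chosen for the example.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Illustrative Beta-Bernoulli Thompson sampling (assumed setting).

    true_probs: hypothetical success probability of each arm, unknown to the agent.
    Returns the number of pulls and observed successes per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Draw one belief per arm from its Beta posterior (Beta(1, 1) prior assumed).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Act greedily with respect to the randomly drawn belief.
        arm = max(range(n_arms), key=samples.__getitem__)
        # Simulate a Bernoulli reward from the environment.
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls, successes


if __name__ == "__main__":
    pulls, successes = thompson_sampling([0.1, 0.5, 0.9], n_rounds=2000)
    print("pulls per arm:", pulls)
    print("successes per arm:", successes)
```

Because each action is chosen by maximizing over a posterior sample rather than the posterior mean, arms with uncertain estimates are still tried occasionally, while arms that are clearly worse are pulled less and less often.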

Papers

Showing 651-655 of 655 papers

Title | Status | Hype
Exploiting correlation and budget constraints in Bayesian multi-armed bandit optimization | | 0
Learning to Optimize Via Posterior Sampling | | 0
Thompson Sampling for Contextual Bandits with Linear Payoffs | Code | 0
Thompson Sampling: An Asymptotically Optimal Finite Time Analysis | Code | 0
An Empirical Evaluation of Thompson Sampling | | 0

No leaderboard results yet.