SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
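As a rough illustration of "maximizing expected reward with respect to a randomly drawn belief", the sketch below assumes a Bernoulli bandit with an independent uniform Beta(1, 1) prior on each arm. The function names (`thompson_step`, `run_bandit`) and the arm probabilities are hypothetical, chosen only for this example. Each round draws one plausible success rate per arm from its posterior and plays the arm whose draw is largest. For Bernoulli rewards the drawn rate is itself the expected reward under that belief.

```python
import random


def thompson_step(successes, failures, rng):
    """Choose an arm by drawing one belief per arm from its Beta posterior."""
    # Beta(s + 1, f + 1) is the posterior under an assumed uniform Beta(1, 1) prior.
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    # Play the arm that is best under the randomly drawn belief.
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit (illustrative only)."""
    rng = random.Random(seed)
    successes = [0] * len(true_probs)
    failures = [0] * len(true_probs)
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Simulated environment: reward is 1 with the arm's true probability.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


# Hypothetical arm probabilities, not taken from any listed paper.
successes, failures = run_bandit([0.2, 0.5, 0.8], n_rounds=2000)
pulls = [s + f for s, f in zip(successes, failures)]
print("pulls per arm:", pulls)
```

Early on the posteriors are wide, so every arm has a real chance of producing the largest draw (exploration). As counts accumulate, the posteriors concentrate and the arm with the highest estimated rate wins most draws (exploitation).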

Papers

Showing 611-620 of 655 papers

Title | Status | Hype
Human collective intelligence as distributed Bayesian inference | | 0
Asymptotically Optimal Algorithms for Budgeted Multiple Play Bandits | | 0
Online Algorithms For Parameter Mean And Variance Estimation In Dynamic Regression Models | | 0
Linear Bandit algorithms using the Bootstrap | | 0
Double Thompson Sampling for Dueling Bandits | Code | 0
An Unbiased Data Collection and Content Exploitation/Exploration Strategy for Personalization | | 0
A sequential Monte Carlo approach to Thompson sampling for Bayesian optimization | | 0
Optimal Recommendation to Users that React: Online Learning for a Class of POMDPs | | 0
Cascading Bandits for Large-Scale Recommendation Problems | Code | 0
Simple Bayesian Algorithms for Best Arm Identification | | 0

No leaderboard results yet.