SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
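
As an illustration of "maximizing expected reward with respect to a randomly drawn belief", here is a minimal sketch for a Bernoulli bandit. It assumes independent Beta(1, 1) priors on each arm's success probability. The function name `thompson_sampling_bernoulli`, its parameters, and the example arm means are hypothetical, chosen for illustration only and not taken from any paper listed here.

```python
import random


def thompson_sampling_bernoulli(true_means, horizon, seed=0):
    """Run Beta-Bernoulli Thompson sampling on a simulated bandit.

    true_means: hypothetical success probability of each arm (unknown to the agent).
    horizon:    number of rounds to play.
    Returns (pulls, successes, failures), one entry per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(horizon):
        # Randomly drawn belief: one sample per arm from its Beta posterior,
        # Beta(1 + successes, 1 + failures) under the assumed Beta(1, 1) prior.
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        # Act greedily with respect to the sampled belief.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulated environment: Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_means[arm] else 0

        # Posterior update for the chosen arm only.
        pulls[arm] += 1
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = thompson_sampling_bernoulli(
        [0.2, 0.5, 0.8], horizon=2000
    )
    print("pulls per arm:", pulls)
```

Because each arm's sample comes from its posterior, arms with few observations have wide posteriors and are still chosen occasionally (exploration), while arms with strong evidence of high reward are chosen most of the time (exploitation).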

Papers

Showing 471-480 of 655 papers

| Title | Status | Hype |
|---|---|---|
| Thompson Sampling for Linear Bandit Problems with Normal-Gamma Priors | | 0 |
| Thompson Sampling for Linear-Quadratic Control Problems | | 0 |
| Thompson sampling for linear quadratic mean-field teams | | 0 |
| Thompson Sampling for Noncompliant Bandits | | 0 |
| Thompson Sampling for Online Learning with Linear Experts | | 0 |
| Thompson Sampling for Parameterized Markov Decision Processes with Uninformative Actions | | 0 |
| Thompson Sampling for Pursuit-Evasion Problems | | 0 |
| Thompson Sampling for Real-Valued Combinatorial Pure Exploration of Multi-Armed Bandit | | 0 |
| Thompson Sampling For Stochastic Bandits with Graph Feedback | | 0 |
| Thompson Sampling for Stochastic Bandits with Noisy Contexts: An Information-Theoretic Regret Analysis | | 0 |

No leaderboard results yet.