SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
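
As a rough illustration of "maximizing expected reward with respect to a randomly drawn belief", the sketch below runs Thompson sampling on a simulated Bernoulli bandit. It assumes Beta(1, 1) priors on each arm's success probability. The function name `thompson_sampling`, the `true_probs` values, and the seed are hypothetical choices made for this example, not taken from any of the listed papers.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling on a simulated bandit.

    true_probs: hypothetical per-arm success probabilities, hidden from the agent.
    Returns (pull counts per arm, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    # Beta(1, 1) prior on each arm: alpha counts successes, beta counts failures.
    alpha = [1] * n_arms
    beta = [1] * n_arms
    pulls = [0] * n_arms
    total_reward = 0

    for _ in range(n_rounds):
        # Draw one belief (a plausible success probability) per arm from its posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # Act greedily with respect to the sampled belief.
        arm = max(range(n_arms), key=samples.__getitem__)
        # Simulate the environment's Bernoulli reward for the chosen arm.
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Conjugate posterior update for the chosen arm only.
        if reward:
            alpha[arm] += 1
        else:
            beta[arm] += 1
        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward


if __name__ == "__main__":
    pulls, total = thompson_sampling([0.2, 0.5, 0.8], n_rounds=2000)
    print("pulls per arm:", pulls)
    print("total reward:", total)
```

Because each action is chosen from a posterior sample rather than the posterior mean, arms with uncertain estimates still get tried occasionally, while arms that are clearly worse are sampled less and less often.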

Papers

Showing 471-480 of 655 papers

Title | Status | Hype
Only Pay for What Is Uncertain: Variance-Adaptive Thompson Sampling | | 0
On Multi-Armed Bandit Designs for Dose-Finding Clinical Trials | | 0
On Online Learning in Kernelized Markov Decision Processes | | 0
On The Differential Privacy of Thompson Sampling With Gaussian Prior | | 0
On the Importance of Uncertainty in Decision-Making with Large Language Models | | 0
On the Performance of Thompson Sampling on Logistic Bandits | | 0
On the Prior Sensitivity of Thompson Sampling | | 0
On Thompson Sampling for Smoother-than-Lipschitz Bandits | | 0
On Thompson Sampling with Langevin Algorithms | | 0
On Frequentist Regret of Linear Thompson Sampling | | 0

No leaderboard results yet.