SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
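
The selection rule described above can be sketched for the simplest case, a Bernoulli bandit with a Beta posterior per arm. This is an illustrative sketch, not code from any of the listed papers: the function names `thompson_step` and `run_bandit`, the uniform Beta(1, 1) prior, and the example reward probabilities are all assumptions made for the illustration. A "randomly drawn belief" here is one sample per arm from that arm's posterior, and the chosen action is the arm whose sampled success probability is largest.

```python
import random


def thompson_step(successes, failures, rng):
    """Draw one belief per arm from its Beta posterior and return the best arm.

    Assumes a uniform Beta(1, 1) prior, so the posterior for an arm with
    s successes and f failures is Beta(s + 1, f + 1).
    """
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    # Under the sampled belief, the expected reward of an arm is its sampled
    # success probability, so the maximizing action is the argmax.
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds=2000, seed=0):
    """Simulate a Bernoulli bandit (hypothetical setup) and return pulls per arm."""
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Observe a Bernoulli reward from the (unknown to the agent) true arm.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


if __name__ == "__main__":
    pulls = run_bandit([0.2, 0.5, 0.8])
    print("pulls per arm:", pulls)
```

Early on, the posteriors are wide, so every arm has a fair chance of producing the largest sample (exploration). As evidence accumulates, the posteriors narrow and the arm with the highest estimated reward wins most draws (exploitation).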

Papers

Showing 191-200 of 655 papers

| Title | Status | Hype |
|---|---|---|
| Efficiently Tackling Million-Dimensional Multiobjective Problems: A Direction Sampling and Fine-Tuning Approach | | 0 |
| Sharp Deviations Bounds for Dirichlet Weighted Sums with Application to analysis of Bayesian algorithms | | 0 |
| GUTS: Generalized Uncertainty-Aware Thompson Sampling for Multi-Agent Active Search | | 0 |
| Adaptive Experimentation at Scale: A Computational Framework for Flexible Batches | | 0 |
| Only Pay for What Is Uncertain: Variance-Adaptive Thompson Sampling | | 0 |
| A Unified and Efficient Coordinating Framework for Autonomous DBMS Tuning | | 0 |
| A General Recipe for the Analysis of Randomized Multi-Armed Bandit Algorithms | | 0 |
| Thompson Sampling for Linear Bandit Problems with Normal-Gamma Priors | | 0 |
| The Choice of Noninformative Priors for Thompson Sampling in Multiparameter Bandit Models | | 0 |
| When Combinatorial Thompson Sampling meets Approximation Regret | | 0 |

No leaderboard results yet.