SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
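As an illustration of the definition above, here is a minimal sketch for the Bernoulli bandit case. It assumes a uniform Beta(1, 1) prior on each arm's success probability. The function names `thompson_step` and `run_bandit` and the simulated `true_probs` are hypothetical, chosen only for this example. The "randomly drawn belief" is one sample per arm from its Beta posterior, and the chosen action is the arm whose sample is largest.

```python
import random


def thompson_step(successes, failures, rng):
    """Choose an arm: draw one belief per arm, then act greedily on the draw."""
    # Posterior for each arm under an assumed Beta(1, 1) prior.
    samples = [
        rng.betavariate(s + 1, f + 1)
        for s, f in zip(successes, failures)
    ]
    # The action maximizing expected reward under the sampled belief.
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit (illustrative only)."""
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Simulated environment: reward is 1 with probability true_probs[arm].
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    s, f = run_bandit([0.1, 0.9], n_rounds=2000)
    print("pulls per arm:", [a + b for a, b in zip(s, f)])
```

Because arms with few observations have wide posteriors, they are still sampled occasionally (exploration), while arms with consistently high observed reward are chosen most of the time (exploitation).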

Papers

Showing 51-60 of 655 papers

Title | Status | Hype
An Adversarial Analysis of Thompson Sampling for Full-information Online Learning: from Finite to Infinite Action Spaces | | 0
Analysis and Design of Thompson Sampling for Stochastic Partial Monitoring | | 0
Analysis of Thompson Sampling for Combinatorial Multi-armed Bandit with Probabilistically Triggered Arms | | 0
Adaptive Rate of Convergence of Thompson Sampling for Gaussian Process Optimization | | 0
Analysis of Thompson Sampling for Graphical Bandits Without the Graphs | | 0
Analysis of Thompson Sampling for Partially Observable Contextual Multi-Armed Bandits | | 0
Analyzing and Enhancing Queue Sampling for Energy-Efficient Remote Control of Bandits | | 0
An Analysis of Ensemble Sampling | | 0
An Arm-Wise Randomization Approach to Combinatorial Linear Semi-Bandits | | 0
AdaptEx: A Self-Service Contextual Bandit Platform | | 0

No leaderboard results yet.