SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
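The idea of acting greedily with respect to a randomly drawn belief can be made concrete with a minimal sketch of Thompson sampling for a Bernoulli bandit, where each arm keeps a Beta posterior over its unknown reward probability. The class name and the arm probabilities below are illustrative, not from the source.

```python
import random


class BernoulliThompsonSampler:
    """Thompson sampling for a Bernoulli multi-armed bandit (illustrative sketch).

    Each arm keeps a Beta(successes + 1, failures + 1) posterior over its
    unknown reward probability. At each step we draw one sample from every
    posterior and pull the arm whose sampled value is largest.
    """

    def __init__(self, n_arms):
        self.successes = [0] * n_arms
        self.failures = [0] * n_arms

    def select_arm(self):
        # Draw one sample per arm from its Beta posterior (the "randomly
        # drawn belief"), then act greedily with respect to that draw.
        samples = [
            random.betavariate(s + 1, f + 1)
            for s, f in zip(self.successes, self.failures)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        # Binary reward: 1 = success, 0 = failure.
        if reward:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1


# Hypothetical simulation: three arms with made-up success probabilities.
random.seed(0)
true_probs = [0.2, 0.5, 0.8]
sampler = BernoulliThompsonSampler(len(true_probs))
for _ in range(2000):
    arm = sampler.select_arm()
    sampler.update(arm, 1 if random.random() < true_probs[arm] else 0)

# The posterior concentrates on the best arm, which ends up pulled most often.
pulls = [s + f for s, f in zip(sampler.successes, sampler.failures)]
best_arm = max(range(len(pulls)), key=pulls.__getitem__)
```

Because exploration comes from posterior sampling rather than an explicit exploration bonus, arms that still look plausibly optimal keep getting occasional pulls, while clearly inferior arms are pulled less and less.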

Papers

Showing 521–530 of 655 papers

| Title | Status | Hype |
| --- | --- | --- |
| When and why randomised exploration works (in linear bandits) | | 0 |
| When Combinatorial Thompson Sampling meets Approximation Regret | | 0 |
| Practical Batch Bayesian Sampling Algorithms for Online Adaptive Traffic Experimentation | | 0 |
| Zero-Inflated Bandits | | 0 |
| A Bandit Approach to Online Pricing for Heterogeneous Edge Resource Allocation | | 0 |
| A Batched Multi-Armed Bandit Approach to News Headline Testing | | 0 |
| Context in Public Health for Underserved Communities: A Bayesian Approach to Online Restless Bandits | | 0 |
| A Bayesian Choice Model for Eliminating Feedback Loops | | 0 |
| Accelerating Grasp Exploration by Leveraging Learned Priors | | 0 |
| A Change-Detection Based Thompson Sampling Framework for Non-Stationary Bandits | | 0 |
Page 53 of 66
