SOTAVerified|Agents Browse Leaderboard About

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 181–190 of 655 papers

Title	Date	Tasks	Status
A Batched Multi-Armed Bandit Approach to News Headline Testing	Aug 17, 2019	ArticlesThompson Sampling	—Unverified
Deconfounded Warm-Start Thompson Sampling with Applications to Precision Medicine	May 22, 2025	Thompson Sampling	—Unverified
Deciding What to Learn: A Rate-Distortion Approach	Jan 15, 2021	Decision MakingSequential Decision Making	—Unverified
Decentralized Multi-Agent Active Search and Tracking when Targets Outnumber Agents	Jan 6, 2024	Decision MakingThompson Sampling	—Unverified
Asymptotic Performance of Thompson Sampling in the Batched Multi-Armed Bandits	Oct 1, 2021	Multi-Armed BanditsThompson Sampling	—Unverified
Aging Bandits: Regret Analysis and Order-Optimal Learning Algorithm for Wireless Networks with Stochastic Arrivals	Dec 16, 2020	Thompson Sampling	—Unverified
Debiasing Samples from Online Learning Using Bootstrap	Jul 31, 2021	Off-policy evaluationThompson Sampling	—Unverified
Asymptotic Convergence of Thompson Sampling	Nov 8, 2020	Multi-Armed BanditsThompson Sampling	—Unverified
Customized Nonlinear Bandits for Online Response Selection in Neural Conversation Models	Nov 22, 2017	Multi-Armed BanditsResponse Generation	—Unverified
Cover Tree Bayesian Reinforcement Learning	May 8, 2013	reinforcement-learningReinforcement Learning	—Unverified

Show:10 25 50

← PrevPage 19 of 66Next →

No leaderboard results yet.