Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step, it draws a belief at random from the current posterior over reward models and chooses the action that maximizes the expected reward under that draw.
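
The description above can be made concrete with a minimal sketch. It assumes Bernoulli (0/1) rewards and a uniform Beta(1, 1) prior on each arm's success probability, neither of which is specified on this page; the names `thompson_step` and `run_bandit` are hypothetical and chosen only for illustration.

```python
import random


def thompson_step(successes, failures, rng):
    """Choose an arm by drawing one plausible mean per arm from its Beta posterior.

    Assumes a Beta(1, 1) prior, so the posterior for an arm with s successes
    and f failures is Beta(s + 1, f + 1).
    """
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    # Act greedily with respect to the randomly drawn belief.
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_means, n_rounds, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit (illustrative only)."""
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Simulated environment: reward is 1 with probability true_means[arm].
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls, successes, failures


if __name__ == "__main__":
    pulls, _, _ = run_bandit([0.2, 0.5, 0.8], n_rounds=1000)
    print("pulls per arm:", pulls)
```

Because each arm's sample comes from its own posterior, arms with little data have wide posteriors and are still selected occasionally (exploration), while arms with strong evidence of high reward are selected most of the time (exploitation).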

Papers

Showing 171–180 of 655 papers

Title	Date	Tasks	Status
Contextual Bandit with Herding Effects: Algorithms and Recommendation Applications	Aug 26, 2024	Multi-Armed Bandits, Thompson Sampling	Unverified
Contextual Multi-armed Bandit Algorithm for Semiparametric Reward Model	Jan 31, 2019	Recommendation Systems, Thompson Sampling	Unverified
Contextual Multi-Armed Bandits for Causal Marketing	Oct 2, 2018	Causal Inference, counterfactual	Unverified
Contextual Thompson Sampling via Generation of Missing Data	Feb 10, 2025	Decision Making, Fairness	Unverified
Convergence Rates of Posterior Distributions in Markov Decision Process	Jul 22, 2019	Thompson Sampling	Unverified
Convolutional Monte Carlo Rollouts in Go	Dec 10, 2015	GPU, Thompson Sampling	Unverified
Cost Aware Asynchronous Multi-Agent Active Search	Oct 5, 2022	Decision Making, Thompson Sampling	Unverified
Cost-efficient Knowledge-based Question Answering with Large Language Models	May 27, 2024	Knowledge Graphs, Model Selection	Unverified
Asymptotically Optimal Bandits under Weighted Information	May 28, 2021	Thompson Sampling	Unverified
An Information-Theoretic Analysis of Thompson Sampling for Logistic Bandits	Dec 3, 2024	Thompson Sampling	Unverified

No leaderboard results yet.