SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step it draws a belief (e.g., a set of model parameters) at random from the current posterior distribution and chooses the action that maximizes expected reward under that sampled belief.
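The posterior-sampling rule above can be sketched for the simplest case, a Bernoulli bandit with conjugate Beta priors. This is a minimal illustration, not code from any of the papers listed below; the arm probabilities, round count, and function name are chosen for the example.

```python
import random


def thompson_sampling(true_probs, n_rounds=2000, seed=0):
    """Beta-Bernoulli Thompson sampling on a Bernoulli bandit.

    true_probs: unknown-to-the-agent success probability of each arm.
    Returns the per-arm Beta posterior parameters and the total reward.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    # Start from uniform Beta(1, 1) priors: alpha counts successes,
    # beta counts failures, per arm.
    alpha = [1] * n_arms
    beta = [1] * n_arms
    total_reward = 0
    for _ in range(n_rounds):
        # Draw one sample from each arm's posterior, then act greedily
        # on the sampled beliefs -- this is the Thompson sampling step.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(n_arms)]
        arm = max(range(n_arms), key=lambda i: samples[i])
        # Observe a Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_probs[arm] else 0
        total_reward += reward
        # Conjugate posterior update for the pulled arm only.
        alpha[arm] += reward
        beta[arm] += 1 - reward
    return alpha, beta, total_reward


alpha, beta, total = thompson_sampling([0.2, 0.5, 0.8])
# The arm pulled most often is the one the posterior has concentrated on.
most_pulled = max(range(3), key=lambda i: alpha[i] + beta[i])
```

Because posterior samples are random, poorly understood arms (wide posteriors) are occasionally sampled high and explored, while well-understood good arms are exploited; no explicit exploration schedule is needed.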

Papers

Showing 151–160 of 655 papers

| Title | Status | Hype |
| --- | --- | --- |
| Bandits Under The Influence (Extended Version) | | 0 |
| Analysis of Thompson Sampling for Partially Observable Contextual Multi-Armed Bandits | | 0 |
| Bandit Policies for Reliable Cellular Network Handovers in Extreme Mobility | | 0 |
| Bandit Models of Human Behavior: Reward Processing in Mental Disorders | | 0 |
| Analysis of Thompson Sampling for Graphical Bandits Without the Graphs | | 0 |
| Adaptive Exploration-Exploitation Tradeoff for Opportunistic Bandits | | 0 |
| A Closer Look at the Worst-case Behavior of Multi-armed Bandit Algorithms | | 0 |
| Context in Public Health for Underserved Communities: A Bayesian Approach to Online Restless Bandits | | 0 |
| Bandit Learning for Diversified Interactive Recommendation | | 0 |
| Adaptive Rate of Convergence of Thompson Sampling for Gaussian Process Optimization | | 0 |
Page 16 of 66
