SOTAVerified|Agents Browse Leaderboard About Blog

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 41–50 of 655 papers

Title	Date	Tasks	Status
Analysis of Thompson Sampling for Partially Observable Contextual Multi-Armed Bandits	Oct 23, 2021	Decision MakingMulti-Armed Bandits	—Unverified
An Analysis of Ensemble Sampling	Mar 2, 2022	Thompson Sampling	—Unverified
Aging Bandits: Regret Analysis and Order-Optimal Learning Algorithm for Wireless Networks with Stochastic Arrivals	Dec 16, 2020	Thompson Sampling	—Unverified
A General Recipe for the Analysis of Randomized Multi-Armed Bandit Algorithms	Mar 10, 2023	Thompson Sampling	—Unverified
Accelerating Grasp Exploration by Leveraging Learned Priors	Nov 11, 2020	ObjectThompson Sampling	—Unverified
A General Theory of the Stochastic Linear Bandit and Its Applications	Feb 12, 2020	Multi-Armed BanditsThompson Sampling	—Unverified
A Formal Solution to the Grain of Truth Problem	Sep 16, 2016	Thompson Sampling	—Unverified
Algorithms for Adaptive Experiments that Trade-off Statistical Analysis with Reward: Combining Uniform Random Assignment and Reward Maximization	Dec 15, 2021	Thompson Sampling	—Unverified
Aligning AI Agents via Information-Directed Sampling	Oct 18, 2024	Thompson Sampling	—Unverified
AdaptEx: A Self-Service Contextual Bandit Platform	Aug 8, 2023	Multi-Armed BanditsThompson Sampling	—Unverified

Show:10 25 50

← PrevPage 5 of 66Next →

No leaderboard results yet.