Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step it draws a belief at random from the current posterior over the unknown reward model and chooses the action that maximizes the expected reward under that draw.
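As an illustration of the idea, here is a minimal sketch for the simplest setting: a bandit with binary (0/1) rewards and an independent uniform Beta(1, 1) prior on each arm. The reward model, the prior, and the names `thompson_sampling`, `true_probs`, and `n_rounds` are assumptions made for this example and do not come from any of the papers listed on this page.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling and return pull counts per arm.

    true_probs is the hidden success probability of each arm; it is used
    only to simulate rewards (an assumption of this sketch).
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Randomly drawn belief: one sample per arm from its Beta posterior,
        # Beta(successes + 1, failures + 1) under a uniform prior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Choose the action that maximizes expected reward under that draw.
        arm = max(range(n_arms), key=samples.__getitem__)

        # Simulate a Bernoulli reward and update the chosen arm's posterior.
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls


pulls = thompson_sampling([0.2, 0.5, 0.8], n_rounds=2000)
print(pulls)
```

Arms with few observations have wide posteriors, so they occasionally produce the largest sample and get explored; as evidence accumulates, the posteriors narrow and most pulls concentrate on the arm with the highest estimated reward.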

Papers

Showing 651–655 of 655 papers

Title	Date	Tasks	Status
Thompson Sampling for Contextual Bandits with Linear Payoffs	Sep 15, 2012	Multi-Armed Bandits, Thompson Sampling	Code Available
Anytime Multi-Agent Path Finding with an Adaptive Delay-Based Heuristic	Aug 6, 2024	Multi-Agent Path Finding, Self-Learning	Code Available
AIXIjs: A Software Demo for General Reinforcement Learning	May 22, 2017	General Reinforcement Learning, OpenAI Gym	Code Available
Thompson Sampling Algorithms for Mean-Variance Bandits	Feb 1, 2020	Decision Making, Thompson Sampling	Code Available
Evaluating Deep Vs. Wide & Deep Learners As Contextual Bandits For Personalized Email Promo Recommendations	Jan 31, 2022	Multi-Armed Bandits, Thompson Sampling	Code Available

No leaderboard results yet.