
Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
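
The description above can be illustrated with a minimal sketch for a Bernoulli bandit. It assumes binary rewards and a uniform Beta(1, 1) prior on each arm's success probability; the function name `thompson_sampling`, its parameters, and the example probabilities are hypothetical choices for illustration and are not taken from any paper listed here. In each round, one success probability per arm is drawn from that arm's current posterior (the "randomly drawn belief"), and the arm with the largest draw is played.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling on a simulated bandit.

    true_probs: hypothetical success probability of each arm (unknown to the agent).
    Returns per-arm pull counts, success counts, and failure counts.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Randomly drawn belief: one sample per arm from its Beta posterior,
        # Beta(1 + successes, 1 + failures) under the assumed uniform prior.
        beliefs = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        # Play the arm that maximizes expected reward under the sampled belief.
        arm = max(range(n_arms), key=lambda a: beliefs[a])

        # Simulate a Bernoulli reward and update that arm's counts.
        reward = 1 if rng.random() < true_probs[arm] else 0
        pulls[arm] += 1
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = thompson_sampling([0.2, 0.5, 0.8], n_rounds=2000)
    print("pulls per arm:", pulls)
```

Because arms are chosen in proportion to how plausible it is that they are the best, poorly performing arms are still tried occasionally early on (exploration), while play concentrates on the apparently best arm as evidence accumulates (exploitation).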

Papers


Showing 631–640 of 655 papers

Title	Date	Tasks	Status
Thompson Sampling for Budgeted Multi-armed Bandits	May 1, 2015	Multi-Armed Bandits, Thompson Sampling	Unverified
Evaluation of Explore-Exploit Policies in Multi-result Ranking Systems	Apr 28, 2015	News Recommendation, Thompson Sampling	Unverified
A Note on Information-Directed Sampling and Thompson Sampling	Mar 24, 2015	Thompson Sampling	Unverified
Bandit Convex Optimization: √T Regret in One Dimension	Feb 23, 2015	Thompson Sampling	Unverified
Thompson Sampling with the Online Bootstrap	Oct 15, 2014	Thompson Sampling	Unverified
Freshness-Aware Thompson Sampling	Sep 29, 2014	Recommendation Systems, Thompson Sampling	Unverified
Towards Optimal Algorithms for Prediction with Expert Advice	Sep 10, 2014	Prediction, Thompson Sampling	Unverified
Thompson Sampling for Learning Parameterized Markov Decision Processes	Jun 29, 2014	Form, reinforcement-learning	Unverified
Efficient Learning in Large-Scale Combinatorial Semi-Bandits	Jun 28, 2014	Thompson Sampling	Unverified
An Information-Theoretic Analysis of Thompson Sampling	Mar 21, 2014	Thompson Sampling	Unverified

