
Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step, it draws a belief about the rewards at random from the current posterior and chooses the action that maximizes the expected reward under that drawn belief.
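
The description above can be sketched for the simplest case. This is a minimal illustration, assuming Bernoulli (0/1) rewards and an independent uniform Beta(1, 1) prior on each arm; the function name `thompson_sampling` and its parameters are hypothetical and are not taken from any paper listed on this page.

```python
import random


def thompson_sampling(true_means, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling on a simulated bandit.

    true_means: per-arm success probabilities (assumed, used only to simulate rewards).
    Returns (pulls, successes, failures) as per-arm count lists.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Draw one belief per arm from its Beta posterior
        # (uniform Beta(1, 1) prior plus observed counts).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Act greedily with respect to the drawn belief.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulate a Bernoulli reward and update that arm's posterior counts.
        reward = 1 if rng.random() < true_means[arm] else 0
        pulls[arm] += 1
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = thompson_sampling([0.2, 0.5, 0.8], n_rounds=2000)
    print("pulls per arm:", pulls)
```

Because each arm's sample comes from its own posterior, arms with few observations still produce occasional high draws and get explored, while arms with consistently low rewards are selected less and less often.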

Papers


Showing 641–650 of 655 papers

Title	Date	Tasks	Status
Better Optimism By Bayes: Adaptive Planning with Rich Models	Feb 9, 2014	Model-based Reinforcement Learning, Reinforcement Learning	Unverified
Bayesian Mixture Modelling and Inference based Thompson Sampling in Monte-Carlo Tree Search	Dec 1, 2013	Thompson Sampling	Unverified
Eluder Dimension and the Sample Complexity of Optimistic Exploration	Dec 1, 2013	Thompson Sampling	Unverified
Thompson Sampling for Complex Bandit Problems	Nov 3, 2013	Thompson Sampling	Unverified
Thompson Sampling for Online Learning with Linear Experts	Nov 3, 2013	Thompson Sampling	Unverified
Generalized Thompson Sampling for Contextual Bandits	Oct 27, 2013	Multi-Armed Bandits, Thompson Sampling	Unverified
Thompson Sampling in Dynamic Systems for Contextual Bandit Problems	Oct 17, 2013	Thompson Sampling	Unverified
Thompson Sampling for 1-Dimensional Exponential Family Bandits	Jul 12, 2013	Thompson Sampling	Unverified
Cover Tree Bayesian Reinforcement Learning	May 8, 2013	reinforcement-learning, Reinforcement Learning	Unverified
Prior-free and prior-dependent regret bounds for Thompson Sampling	Apr 21, 2013	Thompson Sampling	Unverified


