Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step it draws a belief at random from the current posterior over the unknown reward model and chooses the action that maximizes the expected reward under that drawn belief.
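
As an illustration of the description above, here is a minimal sketch for the special case of a Bernoulli bandit with independent Beta(1, 1) priors on each arm. The reward model, the prior, and the names `thompson_sampling`, `true_probs`, and `n_rounds` are assumptions made for this example; they do not come from any of the listed papers.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling (illustrative sketch).

    true_probs: hypothetical success probability of each arm, hidden from
                the learner and used only to simulate rewards.
    Returns per-arm lists (pulls, successes, failures).
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Draw one belief per arm from its posterior Beta(1 + s, 1 + f).
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        # Act greedily with respect to the drawn belief.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulate a Bernoulli reward and update that arm's posterior counts.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = thompson_sampling([0.1, 0.9], n_rounds=2000)
    print("pulls per arm:", pulls)
```

Because the posterior of a poorly performing arm concentrates on low values, its samples rarely win the argmax, so exploration fades without any explicit schedule.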

Papers

Showing 561–570 of 655 papers

Title	Date	Tasks	Status
Efficient-UCBV: An Almost Optimal Algorithm using Variance Estimates	Nov 9, 2017	Thompson Sampling	Unverified
Information Directed Sampling for Stochastic Bandits with Graph Feedback	Nov 8, 2017	Decision Making, Thompson Sampling	Unverified
The Effect of Communication on Noncooperative Multiplayer Multi-Armed Bandit Problems	Nov 5, 2017	Thompson Sampling	Unverified
Generalized Probabilistic Bisection for Stochastic Root-Finding	Nov 2, 2017	Thompson Sampling	Unverified
Minimal Exploration in Structured Stochastic Bandits	Nov 1, 2017	Thompson Sampling	Unverified
Sequential Matrix Completion	Oct 23, 2017	Collaborative Filtering, Matrix Completion	Unverified
A study of Thompson Sampling with Parameter h	Oct 5, 2017	Thompson Sampling	Unverified
Learning Unknown Markov Decision Processes: A Thompson Sampling Approach	Sep 14, 2017	Reinforcement Learning, Thompson Sampling	Unverified
Adaptive Exploration-Exploitation Tradeoff for Opportunistic Bandits	Sep 12, 2017	Thompson Sampling	Unverified
Variational inference for the multi-armed contextual bandit	Sep 10, 2017	Multi-Armed Bandits, Reinforcement Learning	Code Available
