
Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
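As an illustration of "maximizing expected reward with respect to a randomly drawn belief", here is a minimal sketch for the Bernoulli bandit. It assumes binary rewards and a uniform Beta(1, 1) prior on each arm's success probability; those choices, and the function names `thompson_step` and `run_bandit`, are illustrative and do not come from this page or from any listed paper.

```python
import random


def thompson_step(successes, failures, rng):
    """Choose an arm by drawing one belief per arm and acting greedily on it.

    Assumption: each arm's unknown success probability has a Beta posterior,
    Beta(successes + 1, failures + 1), i.e. a uniform prior plus observed counts.
    """
    samples = [
        rng.betavariate(s + 1, f + 1)  # one random draw from each posterior
        for s, f in zip(successes, failures)
    ]
    # Under the sampled belief, the best action is the arm with the largest draw.
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_means, n_rounds, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit; return pulls per arm."""
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Simulated environment: reward is 1 with probability true_means[arm].
        if rng.random() < true_means[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


if __name__ == "__main__":
    # Hypothetical two-armed problem; the better arm should attract most pulls.
    print(run_bandit([0.1, 0.9], n_rounds=2000))
```

Arms with few observations have wide posteriors, so they occasionally produce the largest draw and get explored; as counts accumulate, the draws concentrate near the empirical means and the choice becomes mostly exploitative.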

Papers


Showing 531–540 of 655 papers

Title	Date	Tasks	Status
Nonparametric Gaussian Mixture Models for the Multi-Armed Bandit	Aug 8, 2018	Density Estimation, Multi-Armed Bandits	Code Available
Sequential Monte Carlo Bandits	Aug 8, 2018	Decision Making, Sequential Decision Making	Code Available
Deep Contextual Multi-armed Bandits	Jul 25, 2018	Marketing, Multi-Armed Bandits	Unverified
Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits	Jul 19, 2018	Multi-Armed Bandits, Thompson Sampling	Unverified
Optimization of a SSP's Header Bidding Strategy using Thompson Sampling	Jul 9, 2018	Thompson Sampling	Unverified
Improved Regret Bounds for Thompson Sampling in Linear Quadratic Control Problems	Jul 1, 2018	Reinforcement Learning, Thompson Sampling	Unverified
On The Differential Privacy of Thompson Sampling With Gaussian Prior	Jun 24, 2018	Thompson Sampling	Unverified
Randomized Value Functions via Multiplicative Normalizing Flows	Jun 6, 2018	Efficient Exploration, Thompson Sampling	Code Available
Sequential Test for the Lowest Mean: From Thompson to Murphy Sampling	Jun 4, 2018	Reinforcement Learning, Reinforcement Learning (RL)	Unverified
An Information-Theoretic Analysis for Thompson Sampling with Many Actions	May 30, 2018	Thompson Sampling	Unverified


No leaderboard results yet.