
Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
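As a rough illustration of the idea above, here is a minimal sketch for the special case of Bernoulli (0/1) rewards with a uniform Beta(1, 1) prior on each arm. The function name `thompson_sampling`, the simulated `true_probs`, and the choice of prior are assumptions made for this example and do not come from any of the listed papers.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling on simulated arms (illustrative only)."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms  # number of rewards equal to 1, per arm
    failures = [0] * n_arms   # number of rewards equal to 0, per arm
    pulls = [0] * n_arms      # number of times each arm was chosen

    for _ in range(n_rounds):
        # Randomly drawn belief: one sampled mean reward per arm, taken from
        # that arm's Beta posterior (assumed Beta(1, 1) prior).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Choose the action that maximizes expected reward under that belief.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulated environment: reward is 1 with probability true_probs[arm].
        reward = 1 if rng.random() < true_probs[arm] else 0

        # Posterior update for the chosen arm.
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = thompson_sampling([0.1, 0.9], n_rounds=2000)
    print("pulls per arm:", pulls)
```

Arms with little data have wide posteriors, so they occasionally produce the largest sample and get explored. As evidence accumulates the posteriors narrow, and the arm with the higher true reward probability is chosen more and more often.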

Papers



Title	Date	Tasks	Status
KLUCB Approach to Copeland Bandits	Feb 7, 2019	Information Retrieval, Reinforcement Learning	Unverified
First-Order Bayesian Regret Analysis of Thompson Sampling	Feb 2, 2019	Combinatorial Optimization, Thompson Sampling	Unverified
Contextual Multi-armed Bandit Algorithm for Semiparametric Reward Model	Jan 31, 2019	Recommendation Systems, Thompson Sampling	Unverified
Thompson Sampling for a Fatigue-aware Online Recommendation System	Jan 23, 2019	Thompson Sampling	Code Available
Parallel Contextual Bandits in Wireless Handover Optimization	Jan 21, 2019	Multi-Armed Bandits, Thompson Sampling	Unverified
Information-Directed Exploration for Deep Reinforcement Learning	Dec 18, 2018	Atari Games, Deep Reinforcement Learning	Code Available
MergeDTS: A Method for Effective Large-Scale Online Ranker Evaluation	Dec 11, 2018	Information Retrieval, Online Ranker Evaluation	Code Available
Thompson Sampling for Noncompliant Bandits	Dec 3, 2018	Thompson Sampling	Unverified
Bandit Learning with Implicit Feedback	Dec 1, 2018	Bayesian Inference, Thompson Sampling	Code Available
Optimal Learning for Dynamic Coding in Deadline-Constrained Multi-Channel Networks	Nov 27, 2018	Thompson Sampling	Unverified

