Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
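For the Bernoulli bandit case, the procedure above amounts to keeping a Beta posterior per arm, sampling one value from each posterior, and pulling the arm whose sample is largest. A minimal sketch (the function names and the simulated reward probabilities are illustrative, not from any of the papers listed below):

```python
import random

def thompson_step(successes, failures, rng=random):
    """Draw one sample from each arm's Beta(successes+1, failures+1)
    posterior and return the index of the arm with the largest draw."""
    samples = [rng.betavariate(s + 1, f + 1)
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)

def run_bandit(true_probs, steps=2000, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit with the
    given (hypothetical) arm reward probabilities."""
    rng = random.Random(seed)
    k = len(true_probs)
    successes, failures = [0] * k, [0] * k
    for _ in range(steps):
        arm = thompson_step(successes, failures, rng)
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures
```

Because arms with more observed successes yield posterior samples that are larger on average, play concentrates on the best arm over time while still occasionally exploring the others.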

Papers

Showing papers 81-90 of 655

Title | Status | Hype
Accelerating Approximate Thompson Sampling with Underdamped Langevin Monte Carlo | Code | 0
Efficient Exploration through Bayesian Deep Q-Networks | Code | 0
Evaluating Deep Vs. Wide & Deep Learners As Contextual Bandits For Personalized Email Promo Recommendations | Code | 0
Evolutionary Multi-Armed Bandits with Genetic Thompson Sampling | Code | 0
Fast, Precise Thompson Sampling for Bayesian Optimization | Code | 0
Bayesian Non-stationary Linear Bandits for Large-Scale Recommender Systems | Code | 0
Bandit Learning with Implicit Feedback | Code | 0
Improving Portfolio Optimization Results with Bandit Networks | Code | 0
Information-Directed Exploration for Deep Reinforcement Learning | Code | 0
Automated Creative Optimization for E-Commerce Advertising | Code | 0
Page 9 of 66

No leaderboard results yet.