Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step, it draws a random sample from the current belief (the posterior over the reward model) and chooses the action that maximizes the expected reward under that sample.
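
As an illustration of the description above, here is a minimal sketch for the special case of Bernoulli rewards with a uniform Beta(1, 1) prior on each arm. The function names `thompson_step` and `run_bandit`, the arm probabilities, and the round count are assumptions made for this example and do not come from any paper listed on this page.

```python
import random


def thompson_step(successes, failures, rng):
    """Sample each arm's Beta posterior once and return the argmax arm index."""
    # With a Beta(1, 1) prior, the posterior after s successes and f failures
    # is Beta(s + 1, f + 1). One draw per arm is the "randomly drawn belief".
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds=2000, seed=0):
    """Simulate Bernoulli arms (hypothetical setup) and return pulls per arm."""
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Simulated environment: reward 1 with the arm's true probability.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


if __name__ == "__main__":
    pulls = run_bandit([0.2, 0.5, 0.8])
    print(pulls)  # pulls per arm; most should go to the last arm
```

Because each arm is chosen with the probability that it is optimal under the current posterior, uncertain arms still get tried early on, while pulls concentrate on the best arm as evidence accumulates.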

Papers

Showing 631–640 of 655 papers

Title	Date	Tasks	Status
Nonparametric Gaussian Mixture Models for the Multi-Armed Bandit	Aug 8, 2018	Density Estimation, Multi-Armed Bandits	Code Available
Thompson Sampling For Combinatorial Bandits: Polynomial Regret and Mismatched Sampling Paradox	Oct 7, 2024	Thompson Sampling	Code Available
Efficient Exploration through Bayesian Deep Q-Networks	Feb 13, 2018	Atari Games, Efficient Exploration	Code Available
Show Me the Whole World: Towards Entire Item Space Exploration for Interactive Personalized Recommendations	Oct 19, 2021	Decision Making, Model Selection	Code Available
Thompson Sampling for Linearly Constrained Bandits	Apr 20, 2020	Multi-Armed Bandits, Thompson Sampling	Code Available
Simple Modification of the Upper Confidence Bound Algorithm by Generalized Weighted Averages	Aug 28, 2023	Decision Making, Decision Making Under Uncertainty	Code Available
Tsetlin Machine for Solving Contextual Bandit Problems	Feb 4, 2022	Thompson Sampling	Code Available
Kullback-Leibler Maillard Sampling for Multi-armed Bandits with Bounded Rewards	Apr 28, 2023	Multi-Armed Bandits, Thompson Sampling	Code Available
Bandit Learning with Implicit Feedback	Dec 1, 2018	Bayesian Inference, Thompson Sampling	Code Available
Automated Creative Optimization for E-Commerce Advertising	Feb 28, 2021	AutoML, Click-Through Rate Prediction	Code Available

No leaderboard results yet.