
Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step, it draws a belief at random from the current posterior over reward models and chooses the action that maximizes the expected reward under that draw.
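The definition above can be made concrete with a minimal sketch for the Bernoulli bandit case. It assumes binary rewards and a uniform Beta(1, 1) prior on each arm, neither of which is specified on this page; the function name `thompson_sampling` and the arm means in the usage note are hypothetical and chosen only for illustration.

```python
import random


def thompson_sampling(true_means, n_rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling and return pull counts per arm.

    true_means: hypothetical success probability of each arm (unknown to the agent).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Randomly drawn belief: one sample per arm from its Beta posterior.
        # Assumption: uniform Beta(1, 1) prior on every arm.
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        # Choose the action that maximizes expected reward under that draw.
        arm = max(range(n_arms), key=samples.__getitem__)

        # Simulate a Bernoulli reward from the environment.
        reward = 1 if rng.random() < true_means[arm] else 0

        # Update the posterior counts for the chosen arm only.
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls


if __name__ == "__main__":
    print(thompson_sampling([0.2, 0.5, 0.8], n_rounds=2000))
```

Calling `thompson_sampling([0.2, 0.5, 0.8], n_rounds=2000)` returns one pull count per arm. Because poorly performing arms are sampled less often as their posteriors concentrate, most pulls should go to the arm with the highest mean.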

Papers


Showing 281–290 of 655 papers

Title	Date	Tasks	Status	Hype
Optimal Regret Is Achievable with Bounded Approximate Inference Error: An Enhanced Bayesian Upper Confidence Bound Framework	Jan 31, 2022	Bayesian Inference, Multi-Armed Bandits	Code Available	0
Evaluating Deep Vs. Wide & Deep Learners As Contextual Bandits For Personalized Email Promo Recommendations	Jan 31, 2022	Multi-Armed Bandits, Thompson Sampling	Code Available	0
Modeling Human Exploration Through Resource-Rational Reinforcement Learning	Jan 27, 2022	Meta-Learning, reinforcement-learning	Code Available	0
Augmented RBMLE-UCB Approach for Adaptive Control of Linear Quadratic Systems	Jan 25, 2022	parameter estimation, Thompson Sampling	Unverified	0
IBAC: An Intelligent Dynamic Bandwidth Channel Access Avoiding Outside Warning Range Problem	Jan 15, 2022	Thompson Sampling	Unverified	0
On Dynamic Pricing with Covariates	Dec 25, 2021	Thompson Sampling	Unverified	0
Algorithms for Adaptive Experiments that Trade-off Statistical Analysis with Reward: Combining Uniform Random Assignment and Reward Maximization	Dec 15, 2021	Thompson Sampling	Unverified	0
Safe Linear Leveling Bandits	Dec 13, 2021	Multi-Armed Bandits, Thompson Sampling	Unverified	0
Risk and optimal policies in bandit experiments	Dec 13, 2021	Dimensionality Reduction, Thompson Sampling	Unverified	0
Bayesian Optimization over Permutation Spaces	Dec 2, 2021	Bayesian Optimization, Heuristic Search	Code Available	1

