SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
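The idea above can be sketched concretely for Bernoulli bandits with Beta posteriors: sample one plausible reward rate per arm from its posterior, then play the arm whose sample is largest. This is a minimal illustrative sketch, not code from any of the papers listed below; all function names are made up for this example.

```python
import random

def thompson_step(successes, failures):
    """One Thompson sampling step for Bernoulli arms: draw a sample
    from each arm's Beta(successes + 1, failures + 1) posterior and
    play the arm with the largest draw."""
    samples = [random.betavariate(s + 1, f + 1)
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=lambda i: samples[i])

def run_bandit(true_probs, horizon=2000, seed=0):
    """Simulate Thompson sampling against fixed Bernoulli arms."""
    random.seed(seed)
    k = len(true_probs)
    successes, failures = [0] * k, [0] * k
    for _ in range(horizon):
        arm = thompson_step(successes, failures)
        # Observe a Bernoulli reward and update that arm's posterior.
        if random.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures

s, f = run_bandit([0.2, 0.5, 0.8])
pulls = [s[i] + f[i] for i in range(3)]
```

Because each arm is chosen with probability equal to the posterior probability that it is optimal, play concentrates on the best arm (here index 2) as evidence accumulates.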

Papers

Showing 601–610 of 655 papers

Title | Status | Hype
Accelerating Approximate Thompson Sampling with Underdamped Langevin Monte Carlo | Code | 0
Thompson Sampling for Bandit Learning in Matching Markets | Code | 0
Differentially Private Online Bayesian Estimation With Adaptive Truncation | Code | 0
Multi-Agent Active Search using Realistic Depth-Aware Noise Model | Code | 0
Optimal Regret Analysis of Thompson Sampling in Stochastic Multi-armed Bandit Problem with Multiple Plays | Code | 0
Multi-armed bandits for resource efficient, online optimization of language model pre-training: the use case of dynamic masking | Code | 0
Optimal Regret Is Achievable with Bounded Approximate Inference Error: An Enhanced Bayesian Upper Confidence Bound Framework | Code | 0
Improving Portfolio Optimization Results with Bandit Networks | Code | 0
Thompson Sampling for Robust Transfer in Multi-Task Bandits | Code | 0
Sequential Monte Carlo Bandits | Code | 0
Page 61 of 66

No leaderboard results yet.