
Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
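As an illustration of the "randomly drawn belief" idea, here is a minimal sketch for the Bernoulli bandit case. It assumes binary rewards and a uniform Beta(1, 1) prior on each arm's success probability. The function name `thompson_sampling`, its parameters, and the simulated arms are hypothetical, chosen for this example only and not taken from any of the listed papers.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling (illustrative sketch).

    true_probs: hypothetical success probability of each simulated arm.
    Returns per-arm pull counts, successes, and failures.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Randomly drawn belief: one sample per arm from its Beta posterior.
        # The +1 terms encode the assumed uniform Beta(1, 1) prior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Act greedily with respect to the sampled belief.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Observe a simulated Bernoulli reward and update the posterior counts.
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = thompson_sampling([0.1, 0.9], n_rounds=2000)
    print("pulls per arm:", pulls)
```

Arms with uncertain posteriors occasionally produce high samples and get explored, while arms with strong evidence of high reward are chosen most of the time, so exploration decreases naturally as data accumulates.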

Papers


Showing 81–90 of 655 papers

Title	Date	Tasks	Status
A Reinforcement Learning based Reset Policy for CDCL SAT Solvers	Apr 4, 2024	Reinforcement Learning	Unverified
A relaxed technical assumption for posterior sampling-based reinforcement learning for control of unknown linear systems	Aug 19, 2021	Thompson Sampling	Unverified
A Reliability-aware Multi-armed Bandit Approach to Learn and Select Users in Demand Response	Mar 20, 2020	Avg, Thompson Sampling	Unverified
A resource-constrained stochastic scheduling algorithm for homeless street outreach and gleaning edible food	Mar 15, 2024	Scheduling, Thompson Sampling	Unverified
A sequential Monte Carlo approach to Thompson sampling for Bayesian optimization	Apr 1, 2016	Bayesian Optimization, Thompson Sampling	Unverified
A Simple and Optimal Policy Design with Safety against Heavy-Tailed Risk for Stochastic Bandits	Jun 7, 2022	Multi-Armed Bandits, Thompson Sampling	Unverified
A study of Thompson Sampling with Parameter h	Oct 5, 2017	Thompson Sampling	Unverified
Asymptotically Optimal Algorithms for Budgeted Multiple Play Bandits	Jun 30, 2016	Thompson Sampling	Unverified
Asymptotically Optimal Bandits under Weighted Information	May 28, 2021	Thompson Sampling	Unverified
An Unbiased Data Collection and Content Exploitation/Exploration Strategy for Personalization	Apr 12, 2016	Recommendation Systems, Thompson Sampling	Unverified

