
Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step it draws a belief (a set of model parameters) at random from the current posterior and chooses the action that maximizes the expected reward under that sampled belief.
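As a rough illustration of the description above, here is a minimal sketch for the Bernoulli bandit case. It assumes binary rewards and a uniform Beta(1, 1) prior per arm; the function names `thompson_step` and `run_bandit`, the arm probabilities, and the round count are all hypothetical choices made for this example, not taken from any of the listed papers.

```python
import random


def thompson_step(successes, failures, rng):
    """Sample one belief per arm from its Beta posterior and pick the best arm."""
    # Beta(s + 1, f + 1) is the posterior under an assumed uniform Beta(1, 1) prior.
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    # Choose the arm whose sampled success probability is largest.
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds=2000, seed=0):
    """Simulate a Bernoulli bandit and return how often each arm was pulled."""
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Simulated environment: reward is 1 with the arm's (hidden) probability.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


if __name__ == "__main__":
    # Hypothetical arm probabilities; the last arm is the best one.
    print(run_bandit([0.2, 0.5, 0.8]))
```

Because each arm's posterior narrows as it is pulled, arms that look poor are sampled less often over time, while uncertain arms still get occasional pulls. In this sketch that is the whole exploration mechanism: no separate exploration parameter is needed.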

Papers


Showing 411–420 of 655 papers

Title	Date	Tasks	Status
Scalable Generalized Linear Bandits: Online Computation and Hashing	Jun 1, 2017	Thompson Sampling	Unverified
Scalable Neural Contextual Bandit for Recommender Systems	Jun 26, 2023	Recommendation Systems, Thompson Sampling	Unverified
Scalable regret for learning to control network-coupled subsystems with unknown dynamics	Aug 18, 2021	Thompson Sampling	Unverified
Scalable Thompson Sampling using Sparse Gaussian Process Models	Jun 9, 2020	Thompson Sampling	Unverified
Scalable Thompson Sampling via Optimal Transport	Feb 19, 2019	Decision Making, Sequential Decision Making	Unverified
Scaling Multi-Armed Bandit Algorithms	Jul 25, 2019	Multi-Armed Bandits, Sequential Decision Making	Unverified
Screening for an Infectious Disease as a Problem in Stochastic Control	Nov 1, 2020	Thompson Sampling	Unverified
Semi-Parametric Contextual Bandits with Graph-Laplacian Regularization	May 17, 2022	Multi-Armed Bandits, Thompson Sampling	Unverified
Sequential Best-Arm Identification with Application to Brain-Computer Interface	May 17, 2023	Brain Computer Interface, EEG	Unverified
Sequential Matrix Completion	Oct 23, 2017	Collaborative Filtering, Matrix Completion	Unverified



No leaderboard results yet.