
Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step, it draws a belief at random from the current posterior over the unknown reward parameters and chooses the action that maximizes the expected reward under that drawn belief.
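
As a rough illustration of the idea described above, here is a minimal sketch for the Bernoulli bandit case, assuming a uniform Beta(1, 1) prior on each arm's success probability. The function names (`thompson_step`, `run_bandit`), the arm probabilities, and the round count are hypothetical choices made for this example and do not come from any of the listed papers.

```python
import random


def thompson_step(successes, failures, rng):
    """Choose an arm: draw one belief per arm from its Beta posterior, take the argmax."""
    # Assumption: Beta(1, 1) prior, so the posterior is Beta(successes + 1, failures + 1).
    samples = [
        rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)
    ]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds=2000, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit with hypothetical arm probabilities."""
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Simulated environment: reward is 1 with probability true_probs[arm], else 0.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    wins, losses = run_bandit([0.2, 0.5, 0.8])
    pulls = [w + l for w, l in zip(wins, losses)]
    print("pulls per arm:", pulls)
```

Because each arm is chosen with the probability that it is optimal under the current posterior, arms with few observations still get sampled occasionally (exploration), while pulls concentrate on the apparently best arm as evidence accumulates (exploitation).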

Papers


Showing 271–280 of 655 papers

Title	Date	Tasks	Status
Non-Stationary Bandit Learning via Predictive Sampling	May 4, 2022	Attribute, Thompson Sampling	Unverified
Evolutionary Multi-Armed Bandits with Genetic Thompson Sampling	Apr 26, 2022	Decision Making, Evolutionary Algorithms	Code Available
Thompson Sampling for Bandit Learning in Matching Markets	Apr 26, 2022	Multi-Armed Bandits, Thompson Sampling	Code Available
On Kernelized Multi-Armed Bandits with Constraints	Mar 29, 2022	Multi-Armed Bandits, Thompson Sampling	Unverified
Multi-armed bandits for resource efficient, online optimization of language model pre-training: the use case of dynamic masking	Mar 24, 2022	Bayesian Optimization, Decision Making	Code Available
Thompson Sampling on Asymmetric α-Stable Bandits	Mar 19, 2022	reinforcement-learning, Reinforcement Learning (RL)	Unverified
Regenerative Particle Thompson Sampling	Mar 15, 2022	Thompson Sampling	Unverified
Multi-Agent Active Search using Detection and Location Uncertainty	Mar 9, 2022	Decision Making, Disaster Response	Unverified
An Analysis of Ensemble Sampling	Mar 2, 2022	Thompson Sampling	Unverified
Partial Likelihood Thompson Sampling	Mar 2, 2022	Thompson Sampling	Unverified

