
Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step it draws a belief (a set of model parameters) at random from the current posterior distribution and chooses the action that maximizes the expected reward under that draw.
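
As a minimal sketch of this idea, assume rewards are Bernoulli (0 or 1) and each arm's unknown success probability starts with a uniform Beta(1, 1) prior. The function name `thompson_sampling`, the simulated `true_probs`, and the fixed seed are illustrative assumptions, not taken from any paper listed here.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling on a simulated bandit.

    true_probs: hypothetical success probability of each arm (unknown to the
    learner, used only to simulate rewards).
    Returns per-arm pull, success, and failure counts.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Randomly drawn belief: one sample per arm from its Beta posterior,
        # Beta(1 + successes, 1 + failures) under the assumed uniform prior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Act greedily with respect to the sampled belief.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulate a Bernoulli reward and update the chosen arm's posterior.
        reward = 1 if rng.random() < true_probs[arm] else 0
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = thompson_sampling([0.1, 0.5, 0.9], 2000)
    print("pulls per arm:", pulls)
```

Arms with little data have wide posteriors, so their samples are occasionally the largest and they still get tried. As evidence accumulates, the posteriors narrow and the pulls concentrate on the arm with the highest estimated reward.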

Papers


Showing 321–330 of 655 papers

Title	Date	Tasks	Status
Regularized-OFU: an efficient algorithm for general contextual bandit with optimization oracles	Sep 29, 2021	Multi-Armed Bandits, Thompson Sampling	Unverified
Apple Tasting Revisited: Bayesian Approaches to Partially Monitored Online Binary Classification	Sep 29, 2021	Binary Classification, Thompson Sampling	Unverified
Expected Improvement-based Contextual Bandits	Sep 29, 2021	Bayesian Optimization, Multi-Armed Bandits	Unverified
Deep Exploration for Recommendation Systems	Sep 26, 2021	Recommendation Systems, Thompson Sampling	Unverified
Vaccine allocation policy optimization and budget sharing mechanism using Thompson sampling	Sep 21, 2021	Decision Making, Management	Code Available
Online Learning of Network Bottlenecks via Minimax Paths	Sep 17, 2021	Thompson Sampling	Unverified
Machine Learning for Online Algorithm Selection under Censored Feedback	Sep 13, 2021	BIG-bench Machine Learning, Thompson Sampling	Code Available
Thompson Sampling for Bandits with Clustered Arms	Sep 6, 2021	Clustering, Thompson Sampling	Unverified
A Unifying Theory of Thompson Sampling for Continuous Risk-Averse Bandits	Aug 25, 2021	Thompson Sampling	Code Available
A relaxed technical assumption for posterior sampling-based reinforcement learning for control of unknown linear systems	Aug 19, 2021	Thompson Sampling	Unverified

