
Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
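As an illustration of "maximizing expected reward with respect to a randomly drawn belief", the following is a minimal sketch for the Bernoulli bandit case. It is not taken from any of the papers listed here. It assumes binary rewards and a uniform Beta(1, 1) prior on each arm's success probability; the function name `thompson_sampling` and its parameters `true_probs`, `n_rounds`, and `seed` are hypothetical names chosen for this example.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling on a simulated bandit.

    true_probs: assumed success probability of each arm (used only to
                simulate rewards; the agent never reads it directly).
    Returns per-arm pull counts, success counts, and failure counts.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Draw one belief per arm from its Beta posterior.
        # Beta(1 + successes, 1 + failures) follows from the uniform prior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Act greedily with respect to the sampled belief.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulate a Bernoulli reward and update that arm's posterior counts.
        reward = 1 if rng.random() < true_probs[arm] else 0
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = thompson_sampling([0.1, 0.9], 2000)
    print("pulls per arm:", pulls)
```

Exploration comes from the randomness of the posterior draw: an arm with few observations has a wide posterior and is occasionally sampled high enough to be tried, while well-estimated poor arms are chosen less and less often.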

Papers


Showing 461–470 of 655 papers

Title	Date	Tasks	Status
Incentivized Exploration for Multi-Armed Bandits under Reward Drift	Nov 12, 2019	Multi-Armed Bandits, Thompson Sampling	Unverified
Safe Linear Thompson Sampling with Side Information	Nov 6, 2019	Thompson Sampling	Unverified
On Batch Bayesian Optimization	Nov 4, 2019	Bayesian Optimization, Thompson Sampling	Unverified
On Online Learning in Kernelized Markov Decision Processes	Nov 4, 2019	Thompson Sampling	Unverified
Thompson Sampling for Contextual Bandit Problems with Auxiliary Safety Constraints	Nov 2, 2019	Bayesian Optimization, Decision Making	Unverified
Thompson Sampling via Local Uncertainty	Oct 30, 2019	Decision Making, Multi-Armed Bandits	Code Available
Fixed-Confidence Guarantees for Bayesian Best-Arm Identification	Oct 24, 2019	Thompson Sampling	Unverified
Thompson Sampling in Non-Episodic Restless Bandits	Oct 12, 2019	Open-Ended Question Answering, Thompson Sampling	Unverified
Old Dog Learns New Tricks: Randomized UCB for Bandit Problems	Oct 11, 2019	Thompson Sampling	Code Available
Regret Analysis of Bandit Problems with Causal Background Knowledge	Oct 11, 2019	Thompson Sampling	Unverified


No leaderboard results yet.