
Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
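The description above can be sketched for the simplest case, a Bernoulli bandit. This is a minimal illustration, not taken from any of the papers listed here: it assumes 0/1 rewards and a uniform Beta(1, 1) prior on each arm, and the names `thompson_step` and `run_bandit` are hypothetical. The "randomly drawn belief" is one sample per arm from that arm's Beta posterior; the action chosen is the arm whose sampled mean is largest.

```python
import random


def thompson_step(successes, failures, rng):
    """Draw one belief (a sampled mean per arm) and act greedily on it."""
    # Posterior for each arm is Beta(successes + 1, failures + 1)
    # under the assumed uniform Beta(1, 1) prior.
    samples = [
        rng.betavariate(s + 1, f + 1)
        for s, f in zip(successes, failures)
    ]
    # Choose the arm that maximizes expected reward under the sampled belief.
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_means, horizon, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit; return pulls per arm."""
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(horizon):
        arm = thompson_step(successes, failures, rng)
        # Simulated environment: reward is 1 with probability true_means[arm].
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


# Hypothetical three-armed problem; the last arm has the highest mean.
pulls = run_bandit([0.2, 0.5, 0.8], horizon=2000)
print(pulls)
```

Early on the posteriors are wide, so every arm has a real chance of producing the largest sample and gets explored. As evidence accumulates, the posteriors concentrate and the pulls shift toward the arm with the highest true mean.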

Papers


Showing 351–360 of 655 papers

Title	Date	Tasks	Status	Hype
Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models	Feb 16, 2021	Decision Making, Meta Reinforcement Learning	Unverified	0
Near-Optimal Algorithms for Differentially Private Online Learning in a Stochastic Environment	Feb 16, 2021	Thompson Sampling	Unverified	0
The Elliptical Potential Lemma for General Distributions with an Application to Linear Thompson Sampling	Feb 16, 2021	Decision Making, LEMMA	Unverified	0
Meta-Thompson Sampling	Feb 11, 2021	Efficient Exploration, Meta-Learning	Unverified	0
On the Suboptimality of Thompson Sampling in High Dimensions	Feb 10, 2021	Thompson Sampling, Vocal Bursts Intensity Prediction	Code Available	0
State-Aware Variational Thompson Sampling for Deep Q-Networks	Feb 7, 2021	Thompson Sampling	Code Available	0
Doubly robust Thompson sampling for linear payoffs	Feb 1, 2021	Thompson Sampling	Unverified	0
Weak Signal Asymptotics for Sequentially Randomized Experiments	Jan 25, 2021	Thompson Sampling	Unverified	0
An empirical evaluation of active inference in multi-armed bandits	Jan 21, 2021	BIG-bench Machine Learning, Decision Making	Code Available	1
Scalable Optimization for Wind Farm Control using Coordination Graphs	Jan 19, 2021	Thompson Sampling	Code Available	0


No leaderboard results yet.