SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
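
As an illustration of the description above, here is a minimal sketch for the Bernoulli bandit case. It rests on assumptions that are not stated on this page: rewards are 0 or 1, each arm starts from a uniform Beta(1, 1) prior, and the names `thompson_sampling`, `true_probs`, and `n_rounds` are hypothetical. In each round one plausible success rate per arm is drawn from its posterior (the "randomly drawn belief"), and the arm with the highest draw is played.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling (illustrative sketch).

    true_probs: hidden success probability of each arm (assumed, for simulation only).
    Returns per-arm counts of pulls, successes, and failures.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Draw one belief per arm from its Beta posterior (Beta(1, 1) prior assumed).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Act greedily with respect to the sampled belief.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Observe a simulated 0/1 reward and update that arm's posterior counts.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = thompson_sampling([0.1, 0.9], n_rounds=2000)
    print("pulls per arm:", pulls)
```

Because arms with uncertain posteriors occasionally produce high draws, they still get explored, while arms that are confidently poor are played less and less often.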

Papers

Showing 361–370 of 655 papers

| Title | Status | Hype |
| --- | --- | --- |
| Near-Optimal Algorithms for Differentially Private Online Learning in a Stochastic Environment | | 0 |
| The Elliptical Potential Lemma for General Distributions with an Application to Linear Thompson Sampling | | 0 |
| Meta-Thompson Sampling | | 0 |
| On the Suboptimality of Thompson Sampling in High Dimensions | Code | 0 |
| State-Aware Variational Thompson Sampling for Deep Q-Networks | Code | 0 |
| Doubly robust Thompson sampling for linear payoffs | | 0 |
| Weak Signal Asymptotics for Sequentially Randomized Experiments | | 0 |
| Scalable Optimization for Wind Farm Control using Coordination Graphs | Code | 0 |
| TSEC: a framework for online experimentation under experimental constraints | | 0 |
| Deciding What to Learn: A Rate-Distortion Approach | | 0 |

No leaderboard results yet.