SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
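
As a rough illustration of the definition above, here is a minimal sketch for the Bernoulli bandit case. It assumes a uniform Beta(1, 1) prior on each arm's success probability, which the description does not specify. The names `thompson_step` and `run_bandit` are hypothetical and chosen only for this example. Each round draws one belief per arm from its posterior and plays the arm whose draw is largest.

```python
import random


def thompson_step(successes, failures, rng):
    """Sample one success probability per arm from its Beta posterior
    and return the index of the arm with the largest sample."""
    # Assumption: Beta(1, 1) prior, so the posterior is Beta(s + 1, f + 1).
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit.

    true_probs holds the (hidden) success probability of each arm.
    Returns per-arm success and failure counts after n_rounds pulls.
    """
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Draw a Bernoulli reward from the chosen arm.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    wins, losses = run_bandit([0.1, 0.9], n_rounds=2000)
    pulls = [w + l for w, l in zip(wins, losses)]
    print("pulls per arm:", pulls)
```

Because the arm is chosen from a random posterior draw rather than the posterior mean, arms with few observations still get played occasionally, which is how the method trades exploration against exploitation.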

Papers

Showing 281-290 of 655 papers

Title | Status | Hype
Partial Likelihood Thompson Sampling | - | 0
Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework | - | 0
Thompson Sampling with Unrestricted Delays | - | 0
Double Thompson Sampling in Finite stochastic Games | - | 0
Adaptive Experimentation in the Presence of Exogenous Nonstationary Variation | - | 0
Fast online inference for nonlinear contextual bandit based on Generative Adversarial Network | - | 0
Synthetically Controlled Bandits | - | 0
Remote Contextual Bandits | - | 0
Fourier Representations for Black-Box Optimization over Categorical Variables | - | 0
Bayesian Non-stationary Linear Bandits for Large-Scale Recommender Systems | Code | 0

No leaderboard results yet.