SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
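The "randomly drawn belief" step can be made concrete with the classic Beta-Bernoulli bandit: each arm keeps a Beta posterior over its success probability, one sample is drawn from each posterior, and the arm with the largest sample is pulled. The sketch below is illustrative only (the arm means and counts are hypothetical, not taken from any paper listed on this page):

```python
import random

def thompson_sample(successes, failures):
    """Draw one sample from each arm's Beta posterior and pick the argmax.

    successes/failures are per-arm counts; Beta(1 + s, 1 + f) is the
    posterior under a uniform Beta(1, 1) prior.
    """
    samples = [random.betavariate(1 + s, 1 + f)
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)

# Simulate a 3-armed Bernoulli bandit with hypothetical true means.
true_means = [0.3, 0.5, 0.7]
successes = [0, 0, 0]
failures = [0, 0, 0]
random.seed(0)
for _ in range(2000):
    arm = thompson_sample(successes, failures)
    if random.random() < true_means[arm]:
        successes[arm] += 1
    else:
        failures[arm] += 1

pulls = [s + f for s, f in zip(successes, failures)]
```

Because sampling from the posterior favors arms that are either empirically good or still uncertain, exploration decays naturally: after a few thousand rounds, most pulls concentrate on the best arm without any explicit exploration schedule.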

Papers

Showing 111–120 of 655 papers

Title | Status | Hype
Bayesian bandits: balancing the exploration-exploitation tradeoff via double sampling | Code | 0
Bayesian Non-stationary Linear Bandits for Large-Scale Recommender Systems | Code | 0
Cascading Bandits for Large-Scale Recommendation Problems | Code | 0
Scalable Optimization for Wind Farm Control using Coordination Graphs | Code | 0
Differentially Private Online Bayesian Estimation With Adaptive Truncation | Code | 0
Fast, Precise Thompson Sampling for Bayesian Optimization | Code | 0
Simple Modification of the Upper Confidence Bound Algorithm by Generalized Weighted Averages | Code | 0
Stacked Thompson Bandits | Code | 0
Bayesian Algorithms for Decentralized Stochastic Bandits | Code | 0
Nonparametric Gaussian Mixture Models for the Multi-Armed Bandit | Code | 0
Page 12 of 66

No leaderboard results yet.