SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
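To make "a randomly drawn belief" concrete, here is a minimal sketch for the Bernoulli bandit case. It assumes binary rewards and a uniform Beta(1, 1) prior on each arm's success probability. Neither assumption comes from this page, and the names `thompson_select` and `run_bandit` are hypothetical helpers chosen for illustration. Each round draws one sample per arm from its Beta posterior and plays the arm with the largest sample.

```python
import random


def thompson_select(successes, failures, rng):
    """Sample each arm's Beta posterior once and return the index of the largest draw."""
    # Assumed Beta(1, 1) prior: the posterior is Beta(successes + 1, failures + 1).
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds, seed=0):
    """Simulate a Bernoulli bandit; return per-arm success and failure counts."""
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        arm = thompson_select(successes, failures, rng)
        # Simulated environment: reward is 1 with the arm's (hidden) probability.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    wins, losses = run_bandit([0.2, 0.5, 0.8], n_rounds=2000)
    pulls = [w + l for w, l in zip(wins, losses)]
    print("pulls per arm:", pulls)
```

Because each arm is chosen with the posterior probability that it is the best one, uncertain arms still get played occasionally (exploration), while play concentrates on the arm that looks best as evidence accumulates (exploitation).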

Papers

Showing 191-200 of 655 papers

Title | Status | Hype
Bayesian Quantile and Expectile Optimisation | | 0
An Information-Theoretic Analysis of Thompson Sampling for Logistic Bandits | | 0
Deep Contextual Multi-armed Bandits | | 0
Deep Exploration for Recommendation Systems | | 0
Deep Hierarchy in Bandits | | 0
Delay-Adaptive Learning in Generalized Linear Contextual Bandits | | 0
Adaptively Optimize Content Recommendation Using Multi Armed Bandit Algorithms in E-commerce | | 0
Differentially Private Federated Bayesian Optimization with Distributed Exploration | | 0
Diffusion Approximations for Thompson Sampling | | 0
A Copula approach for hyperparameter transfer learning | | 0

No leaderboard results yet.