SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step it draws a belief about the rewards at random from the current posterior and chooses the action that maximizes the expected reward under that draw.
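
As an illustration of the definition above, here is a minimal sketch of Thompson sampling for a Bernoulli bandit. It is not taken from any of the papers listed here. The Beta(1, 1) prior, the function names thompson_step and run_bandit, and the simulated reward probabilities are all assumptions made for the example.

```python
import random


def thompson_step(successes, failures, rng):
    """Choose an arm by drawing one belief per arm and acting greedily on it."""
    # Sample a plausible mean reward for each arm from its Beta posterior.
    # The +1 terms encode a uniform Beta(1, 1) prior (an assumption here).
    samples = [
        rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)
    ]
    # Pick the arm whose sampled mean is largest.
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit (hypothetical setup)."""
    rng = random.Random(seed)
    successes = [0] * len(true_probs)
    failures = [0] * len(true_probs)
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Simulated environment: reward is 1 with probability true_probs[arm].
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    wins, losses = run_bandit([0.1, 0.9], n_rounds=2000)
    print("pulls per arm:", [w + l for w, l in zip(wins, losses)])
```

Arms with uncertain posteriors occasionally produce large samples and get explored, while arms with strong evidence of high reward are chosen most of the time, so exploration fades as evidence accumulates.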

Papers

Showing 501-510 of 655 papers

Title | Status | Hype
Thompson Sampling with Virtual Helping Agents | | 0
Time-Sensitive Bandit Learning and Satisficing Thompson Sampling | | 0
Top Two Algorithms Revisited | | 0
Towards Optimal Algorithms for Prediction with Expert Advice | | 0
Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework | | 0
Tree Ensembles for Contextual Bandits | | 0
Truthful mechanisms for linear bandit games with private contexts | | 0
TSEB: More Efficient Thompson Sampling for Policy Learning | | 0
TSEC: a framework for online experimentation under experimental constraints | | 0
TS-UCB: Improving on Thompson Sampling With Little to No Additional Computation | | 0

No leaderboard results yet.