SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
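The description above can be made concrete with a minimal sketch for the Bernoulli bandit case. It assumes a uniform Beta(1, 1) prior on each arm's success probability; the function names `thompson_step` and `run_bandit`, and the arm probabilities used in the example, are illustrative and not taken from any listed paper.

```python
import random


def thompson_step(successes, failures, rng):
    """Draw one belief per arm from its Beta posterior, then act greedily on the draw."""
    # Assumption: Beta(1, 1) prior, so the posterior is Beta(successes + 1, failures + 1).
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    # Choose the arm whose sampled success probability is largest.
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit with hypothetical arm probabilities."""
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Simulated environment: reward 1 with probability true_probs[arm], else 0.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    wins, losses = run_bandit([0.1, 0.5, 0.9], n_rounds=2000)
    pulls = [w + l for w, l in zip(wins, losses)]
    print("pulls per arm:", pulls)
```

Because each arm is chosen in proportion to the posterior probability that it is the best one, poorly performing arms are still tried occasionally early on, while most pulls concentrate on the strongest arm as evidence accumulates.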

Papers

Showing 281-290 of 655 papers

Title | Status | Hype
Generator-Mediated Bandits: Thompson Sampling for GenAI-Powered Adaptive Interventions |  | 0
Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits |  | 0
Graph Neural Thompson Sampling |  | 0
Feedback graph regret bounds for Thompson Sampling and UCB |  | 0
Greedy Bandits with Sampled Context |  | 0
Greedy k-Center from Noisy Distance Samples |  | 0
GuideBoot: Guided Bootstrap for Deep Contextual Bandits |  | 0
GUTS: Generalized Uncertainty-Aware Thompson Sampling for Multi-Agent Active Search |  | 0
gym-saturation: Gymnasium environments for saturation provers (System description) |  | 0
Bayesian Mixture Modelling and Inference based Thompson Sampling in Monte-Carlo Tree Search |  | 0

No leaderboard results yet.