SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step it draws a belief about the rewards at random from the current posterior distribution and chooses the action that maximizes the expected reward under that sampled belief.
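
As an illustration of the description above, here is a minimal sketch for the Bernoulli bandit case. It assumes binary rewards and a uniform Beta(1, 1) prior on each arm; the function name `thompson_sampling`, its parameters, and the example reward probabilities are hypothetical and chosen only for this example, not taken from any of the listed papers.

```python
import random


def thompson_sampling(true_probs, rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling and return pull counts per arm.

    Assumptions (illustrative only): rewards are 0/1, each arm starts with a
    uniform Beta(1, 1) prior, and `true_probs` is known only to the simulator.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    pulls = [0] * n_arms

    for _ in range(rounds):
        # Draw one plausible success rate per arm from its Beta posterior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Act greedily with respect to the sampled belief.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulate the environment and update the chosen arm's posterior.
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls


if __name__ == "__main__":
    print(thompson_sampling([0.2, 0.5, 0.8], rounds=2000))
```

Because each arm's sample comes from its own posterior, arms with few observations still get chosen occasionally (exploration), while arms with consistently high observed reward are chosen most of the time (exploitation).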

Papers

Showing 21-30 of 655 papers

Title | Status | Hype
Mercer Features for Efficient Combinatorial Bayesian Optimization | Code | 1
Meta-Learning Stationary Stochastic Process Prediction with Convolutional Neural Processes | Code | 1
Adaptive Anytime Multi-Agent Path Finding Using Bandit-Based Large Neighborhood Search | Code | 1
Optimizing Posterior Samples for Bayesian Optimization via Rootfinding | Code | 1
On Isometry Robustness of Deep 3D Point Cloud Models under Adversarial Attacks | Code | 1
qPOTS: Efficient batch multiobjective Bayesian optimization via Pareto optimal Thompson sampling | Code | 1
Seamlessly Unifying Attributes and Items: Conversational Recommendation for Cold-Start Users | Code | 1
Bayesian Optimization for Categorical and Category-Specific Continuous Inputs | Code | 0
Scalable Exploration via Ensemble++ | Code | 0
Bayesian bandits: balancing the exploration-exploitation tradeoff via double sampling | Code | 0

No leaderboard results yet.