SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
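As an illustration of the description above, here is a minimal sketch for the Bernoulli bandit case. It assumes a uniform Beta(1, 1) prior on each arm's success probability. The names `thompson_step` and `run_bandit` and the arm probabilities are hypothetical choices for this example and are not taken from any listed paper. Each round draws one belief per arm from its posterior, then plays the arm that looks best under that draw.

```python
import random


def thompson_step(successes, failures, rng):
    """Sample one belief per arm from its Beta posterior; return the best arm's index."""
    # Assumption: Beta(1, 1) prior, so the posterior is Beta(successes + 1, failures + 1).
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    # Act greedily with respect to the sampled belief.
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit with hypothetical arm probabilities."""
    rng = random.Random(seed)
    successes = [0] * len(true_probs)
    failures = [0] * len(true_probs)
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Draw a 0/1 reward from the chosen arm and update its posterior counts.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    wins, losses = run_bandit([0.2, 0.5, 0.8], n_rounds=2000)
    print("pulls per arm:", [w + l for w, l in zip(wins, losses)])
```

Because each arm is chosen with the posterior probability that it is the best one, poorly understood arms still get tried occasionally, while pulls concentrate on the strongest arm as evidence accumulates.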

Papers

Showing 121-130 of 655 papers

Title | Status | Hype
Bayesian bandits: balancing the exploration-exploitation tradeoff via double sampling | Code | 0
Bayesian Optimization for Categorical and Category-Specific Continuous Inputs | Code | 0
Causal Bandits for Linear Structural Equation Models | Code | 0
Thompson Sampling for Linearly Constrained Bandits | Code | 0
Thompson Sampling for Robust Transfer in Multi-Task Bandits | Code | 0
Thompson Sampling via Local Uncertainty | Code | 0
Cost-Efficient Online Decision Making: A Combinatorial Multi-Armed Bandit Approach | Code | 0
Dynamic Assortment Selection and Pricing with Censored Preference Feedback | Code | 0
Mixed-Effect Thompson Sampling | Code | 0
Vaccine allocation policy optimization and budget sharing mechanism using Thompson sampling | Code | 0

No leaderboard results yet.