SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
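
The description above can be made concrete with a minimal sketch, assuming Bernoulli (0/1) rewards and an independent uniform Beta(1, 1) prior on each arm's success probability. Neither assumption comes from this page, and the names `choose_arm` and `run_bandit` are hypothetical. The "randomly drawn belief" is one sample from each arm's posterior, and the chosen action is the arm whose sample is largest.

```python
import random


def choose_arm(successes, failures, rng):
    """Draw one belief per arm from its Beta posterior and act greedily on it."""
    # Assumption: Beta(1, 1) prior, so the posterior is Beta(s + 1, f + 1).
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit (illustrative only)."""
    rng = random.Random(seed)
    successes = [0] * len(true_probs)
    failures = [0] * len(true_probs)
    for _ in range(n_rounds):
        arm = choose_arm(successes, failures, rng)
        # Simulated environment: reward is 1 with the arm's true probability.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    wins, losses = run_bandit([0.1, 0.5, 0.9], n_rounds=2000)
    pulls = [w + l for w, l in zip(wins, losses)]
    print("pulls per arm:", pulls)
```

Early on the posteriors are wide, so every arm has a real chance of producing the largest sample (exploration). As evidence accumulates the posteriors narrow, and the arm with the highest estimated reward is chosen almost every round (exploitation).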

Papers

Showing 101–150 of 655 papers

Title | Status | Hype
A Unifying Theory of Thompson Sampling for Continuous Risk-Averse Bandits | Code | 0
Automated Creative Optimization for E-Commerce Advertising | Code | 0
Show Me the Whole World: Towards Entire Item Space Exploration for Interactive Personalized Recommendations | Code | 0
Simple Bayesian Algorithms for Best Arm Identification | Code | 0
Mixed-Effect Thompson Sampling | Code | 0
State-Aware Variational Thompson Sampling for Deep Q-Networks | Code | 0
Thompson Sampling Algorithms for Mean-Variance Bandits | Code | 0
Thompson Sampling: An Asymptotically Optimal Finite Time Analysis | Code | 0
Thompson Sampling for Bandit Learning in Matching Markets | Code | 0
Bandit Learning with Implicit Feedback | Code | 0
Multi-Agent Thompson Sampling for Bandit Applications with Sparse Neighbourhood Structures | Code | 0
Thompson Sampling for High-Dimensional Sparse Linear Contextual Bandits | Code | 0
Thompson Sampling for Multinomial Logit Contextual Bandits | Code | 0
Thompson Sampling for Robust Transfer in Multi-Task Bandits | Code | 0
Scalable Exploration via Ensemble++ | Code | 0
ESCADA: Efficient Safety and Context Aware Dose Allocation for Precision Medicine | Code | 0
Dynamic Assortment Selection and Pricing with Censored Preference Feedback | Code | 0
Tsetlin Machine for Solving Contextual Bandit Problems | Code | 0
Evaluating Deep Vs. Wide & Deep Learners As Contextual Bandits For Personalized Email Promo Recommendations | Code | 0
Improving Portfolio Optimization Results with Bandit Networks | Code | 0
Distributed Thompson sampling under constrained communication | Code | 0
Machine Learning for Online Algorithm Selection under Censored Feedback | Code | 0
Adaptive Interventions with User-Defined Goals for Health Behavior Change | Code | 0
Two-sided Competing Matching Recommendation Markets With Quota and Complementary Preferences Constraints | Code | 0
Cost-Efficient Online Decision Making: A Combinatorial Multi-Armed Bandit Approach | Code | 0
FedRTS: Federated Robust Pruning via Combinatorial Thompson Sampling | Code | 0
Constructing Adversarial Examples for Vertical Federated Learning: Optimal Client Corruption through Multi-Armed Bandit | Code | 0
RoME: A Robust Mixed-Effects Bandit Algorithm for Optimizing Mobile Health Interventions | Code | 0
Bayesian Non-stationary Linear Bandits for Large-Scale Recommender Systems | Code | 0
Constructing Adversarial Examples for Vertical Federated Learning: Optimal Client Corruption through Multi-Armed Bandit | Code | 0
Bayesian Optimization for Categorical and Category-Specific Continuous Inputs | Code | 0
Double Thompson Sampling for Dueling Bandits | Code | 0
Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling | Code | 0
Differentially Private Online Bayesian Estimation With Adaptive Truncation | Code | 0
Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models | Code | 0
Randomized Exploration for Non-Stationary Stochastic Linear Bandits | Code | 0
Bayesian Learning of Optimal Policies in Markov Decision Processes with Countably Infinite State-Space | - | 0
Bayesian-Guided Generation of Synthetic Microbiomes with Minimized Pathogenicity | - | 0
An Empirical Evaluation of Thompson Sampling | - | 0
Bayesian decision-making under misspecified priors with applications to meta-learning | - | 0
Bayesian Collaborative Bandits with Thompson Sampling for Improved Outreach in Maternal Health Program | - | 0
Adaptive Grey-Box Fuzz-Testing with Thompson Sampling | - | 0
Bayesian Best-Arm Identification for Selecting Influenza Mitigation Strategies | - | 0
An Efficient Algorithm For Generalized Linear Bandit: Online Stochastic Gradient Descent and Thompson Sampling | - | 0
Bayesian Bandit Algorithms with Approximate Inference in Stochastic Linear Bandits | - | 0
An Arm-Wise Randomization Approach to Combinatorial Linear Semi-Bandits | - | 0
Adaptive Gating for Single-Photon 3D Imaging | - | 0
A Combinatorial Semi-Bandit Approach to Charging Station Selection for Electric Vehicles | - | 0
Batched Thompson Sampling for Multi-Armed Bandits | - | 0
Batched Thompson Sampling | - | 0

No leaderboard results yet.