SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
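The description above can be made concrete with a minimal sketch for the Bernoulli bandit case. It assumes binary rewards and a uniform Beta(1, 1) prior on each arm, neither of which is specified on this page; the function names `thompson_select` and `run_bandit` and the example success probabilities are hypothetical and chosen only for illustration. Each round, one success probability per arm is drawn from that arm's posterior (the "randomly drawn belief"), and the arm whose draw is largest is played.

```python
import random


def thompson_select(successes, failures, rng):
    """Draw one belief per arm from its Beta posterior and return the best arm's index.

    Assumes a Beta(1, 1) prior, so the posterior for an arm with s successes
    and f failures is Beta(s + 1, f + 1).
    """
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit (illustrative only)."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    for _ in range(n_rounds):
        arm = thompson_select(successes, failures, rng)
        # Simulated environment: reward is 1 with the arm's true probability.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    wins, losses = run_bandit([0.2, 0.5, 0.8], n_rounds=2000)
    pulls = [w + l for w, l in zip(wins, losses)]
    print("pulls per arm:", pulls)
```

Because the posterior of a poorly performing arm still has some mass on high values, such an arm is occasionally sampled, which is how exploration arises without a separate exploration parameter.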

Papers

Showing 401–425 of 655 papers

| Title | Status | Hype |
| --- | --- | --- |
| Robust Dynamic Assortment Optimization in the Presence of Outlier Customers | | 0 |
| Robust Policy Switching for Antifragile Reinforcement Learning for UAV Deconfliction in Adversarial Environments | | 0 |
| Robust Thompson Sampling Algorithms Against Reward Poisoning Attacks | | 0 |
| Safe Linear Leveling Bandits | | 0 |
| Safe Linear Thompson Sampling with Side Information | | 0 |
| Sample-based Dynamic Hierarchical Transformer with Layer and Head Flexibility via Contextual Bandit | | 0 |
| The Price of Incentivizing Exploration: A Characterization via Thompson Sampling and Sample Complexity | | 0 |
| Sampling Acquisition Functions for Batch Bayesian Optimization | | 0 |
| Satisficing in Time-Sensitive Bandit Learning | | 0 |
| Scalable and Interpretable Contextual Bandits: A Literature Review and Retail Offer Prototype | | 0 |
| Scalable Generalized Linear Bandits: Online Computation and Hashing | | 0 |
| Scalable Neural Contextual Bandit for Recommender Systems | | 0 |
| Scalable regret for learning to control network-coupled subsystems with unknown dynamics | | 0 |
| Scalable Thompson Sampling using Sparse Gaussian Process Models | | 0 |
| Scalable Thompson Sampling via Optimal Transport | | 0 |
| Scaling Multi-Armed Bandit Algorithms | | 0 |
| Screening for an Infectious Disease as a Problem in Stochastic Control | | 0 |
| Semi-Parametric Contextual Bandits with Graph-Laplacian Regularization | | 0 |
| Sequential Best-Arm Identification with Application to Brain-Computer Interface | | 0 |
| Sequential Matrix Completion | | 0 |
| Sequential Test for the Lowest Mean: From Thompson to Murphy Sampling | | 0 |
| Sharp Deviations Bounds for Dirichlet Weighted Sums with Application to analysis of Bayesian algorithms | | 0 |
| Simple Bayesian Algorithms for Best Arm Identification | | 0 |
| Simplifying Bayesian Optimization Via In-Context Direct Optimum Sampling | | 0 |
| Sliding-Window Thompson Sampling for Non-Stationary Settings | | 0 |

No leaderboard results yet.