SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
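The description above can be illustrated with a minimal sketch for the Bernoulli bandit case. Everything here is an assumption made for illustration rather than something taken from the listed papers: rewards are assumed to be 0 or 1, each arm's belief is a Beta posterior starting from a uniform Beta(1, 1) prior, and the names `thompson_step` and `run_bandit` are hypothetical. Each round draws one belief per arm and plays the arm that looks best under that draw.

```python
import random


def thompson_step(successes, failures, rng):
    """Draw one plausible success rate per arm and return the best arm's index.

    Assumes Bernoulli rewards with a uniform Beta(1, 1) prior on each arm,
    so the posterior for an arm is Beta(1 + successes, 1 + failures).
    """
    samples = [
        rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)
    ]
    # Act greedily with respect to the randomly drawn belief.
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds=2000, seed=0):
    """Simulate Thompson sampling and return how often each arm was pulled.

    `true_probs` are the hidden success rates (hypothetical values supplied
    by the caller); the algorithm only ever sees the sampled rewards.
    """
    rng = random.Random(seed)
    successes = [0] * len(true_probs)
    failures = [0] * len(true_probs)
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


if __name__ == "__main__":
    pulls = run_bandit([0.2, 0.5, 0.8])
    print("pulls per arm:", pulls)
```

Under these assumptions, exploration comes from the posterior draws alone: arms with few observations have wide posteriors and are occasionally sampled high, while pulls concentrate on the arm with the highest success rate as evidence accumulates.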

Papers

Showing 301-325 of 655 papers

| Title | Status | Hype |
|---|---|---|
| Risk and optimal policies in bandit experiments | | 0 |
| Safe Linear Leveling Bandits | | 0 |
| Doubly Robust Thompson Sampling with Linear Payoffs | | 0 |
| Observation-Free Attacks on Stochastic Bandits | | 0 |
| Optimizing Conditional Value-At-Risk of Black-Box Functions | Code | 0 |
| Adaptive Gating for Single-Photon 3D Imaging | | 0 |
| ESCADA: Efficient Safety and Context Aware Dose Allocation for Precision Medicine | Code | 0 |
| Hierarchical Bayesian Bandits | | 0 |
| The Hardness Analysis of Thompson Sampling for Combinatorial Semi-bandits with Greedy Oracle | | 0 |
| Maillard Sampling: Boltzmann Exploration Done Optimally | | 0 |
| Online Learning of Energy Consumption for Navigation of Electric Vehicles | | 0 |
| Efficient Inference Without Trading-off Regret in Bandits: An Allocation Probability Test for Thompson Sampling | | 0 |
| Variational Bayesian Optimistic Sampling | | 0 |
| Differentially Private Federated Bayesian Optimization with Distributed Exploration | | 0 |
| Analysis of Thompson Sampling for Partially Observable Contextual Multi-Armed Bandits | | 0 |
| Diversified Sampling for Batched Bayesian Optimization with Determinantal Point Processes | | 0 |
| Show Me the Whole World: Towards Entire Item Space Exploration for Interactive Personalized Recommendations | Code | 0 |
| Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning | | 0 |
| Batched Thompson Sampling | | 0 |
| Asymptotic Performance of Thompson Sampling in the Batched Multi-Armed Bandits | | 0 |
| Expected Improvement-based Contextual Bandits | | 0 |
| Regularized-OFU: an efficient algorithm for general contextual bandit with optimization oracles | | 0 |
| Apple Tasting Revisited: Bayesian Approaches to Partially Monitored Online Binary Classification | | 0 |
| Deep Exploration for Recommendation Systems | | 0 |
| Vaccine allocation policy optimization and budget sharing mechanism using Thompson sampling | Code | 0 |

No leaderboard results yet.