
Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
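The definition above can be made concrete with a minimal sketch for the Bernoulli bandit case. It assumes binary rewards and a uniform Beta(1, 1) prior on each arm's success probability; neither assumption comes from this page. The function names `thompson_select` and `run_bandit` and the example arm probabilities are hypothetical, chosen only for illustration. On each round, one belief per arm is drawn from its posterior and the arm with the highest draw is played.

```python
import random


def thompson_select(successes, failures, rng):
    """Draw one belief per arm from its Beta posterior and pick the best draw."""
    samples = [
        # Beta(s + 1, f + 1) is the posterior under an assumed Beta(1, 1) prior.
        rng.betavariate(s + 1, f + 1)
        for s, f in zip(successes, failures)
    ]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds, seed=0):
    """Simulate a Bernoulli bandit; return per-arm success and failure counts."""
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        arm = thompson_select(successes, failures, rng)
        # Simulated reward: 1 with the arm's (hidden) success probability.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    s, f = run_bandit([0.2, 0.5, 0.8], n_rounds=2000)
    pulls = [a + b for a, b in zip(s, f)]
    print("pulls per arm:", pulls)
```

Because each arm is chosen with the probability that it is optimal under the current posterior, poorly performing arms are still tried occasionally early on, while most pulls shift toward the best arm as evidence accumulates.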

Papers

Showing 291-300 of 655 papers

| Title | Status | Hype |
| --- | --- | --- |
| Observation-Free Attacks on Stochastic Bandits | | 0 |
| Doubly Robust Thompson Sampling with Linear Payoffs | | 0 |
| Optimizing Conditional Value-At-Risk of Black-Box Functions | Code | 0 |
| Adaptive Gating for Single-Photon 3D Imaging | | 0 |
| ESCADA: Efficient Safety and Context Aware Dose Allocation for Precision Medicine | Code | 0 |
| Hierarchical Bayesian Bandits | | 0 |
| The Hardness Analysis of Thompson Sampling for Combinatorial Semi-bandits with Greedy Oracle | | 0 |
| Maillard Sampling: Boltzmann Exploration Done Optimally | | 0 |
| Online Learning of Energy Consumption for Navigation of Electric Vehicles | | 0 |
| Efficient Inference Without Trading-off Regret in Bandits: An Allocation Probability Test for Thompson Sampling | | 0 |

No leaderboard results yet.