SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step it draws a belief about the rewards at random from the current posterior distribution and chooses the action that maximizes the expected reward under that draw.
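The description above can be made concrete with a minimal sketch. It assumes Bernoulli (0/1) rewards and a uniform Beta(1, 1) prior on each arm's success rate, neither of which is stated on this page; the function names `thompson_step` and `run_bandit` and the simulated `true_means` are hypothetical and chosen only for illustration.

```python
import random


def thompson_step(successes, failures, rng):
    """Choose an arm by drawing one belief per arm and acting greedily on it.

    Assumption: each arm's success rate has a Beta(1, 1) prior, so its
    posterior after s successes and f failures is Beta(s + 1, f + 1).
    """
    samples = [
        rng.betavariate(s + 1, f + 1)
        for s, f in zip(successes, failures)
    ]
    # The "randomly drawn belief" is the vector of sampled success rates;
    # the chosen action is the arm with the highest sampled rate.
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_means, n_rounds, seed=0):
    """Simulate a Bernoulli bandit and return how often each arm was pulled.

    `true_means` are hypothetical success probabilities, hidden from the
    learner and used only to generate rewards.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms
    failures = [0] * n_arms

    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        reward = 1 if rng.random() < true_means[arm] else 0
        # Update the posterior counts of the arm that was played.
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return [s + f for s, f in zip(successes, failures)]


if __name__ == "__main__":
    pulls = run_bandit([0.1, 0.5, 0.9], n_rounds=2000)
    print("pulls per arm:", pulls)
```

Arms with few observations have wide posteriors, so they are occasionally sampled high and explored; as evidence accumulates, the draws concentrate and the best arm is pulled most often.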

Papers

Showing 401-410 of 655 papers

Title | Status | Hype
Position-Based Multiple-Play Bandits with Thompson Sampling | | 0
Bandit Change-Point Detection for Real-Time Monitoring High-Dimensional Data Under Sampling Control | | 0
Partially Observable Online Change Detection via Smooth-Sparse Decomposition | | 0
Bandits Under The Influence (Extended Version) | | 0
Causal Bandits without prior knowledge using separating sets | | 0
Thompson Sampling for Unsupervised Sequential Selection | | 0
A Change-Detection Based Thompson Sampling Framework for Non-Stationary Bandits | | 0
Efficient Online Learning for Cognitive Radar-Cellular Coexistence via Contextual Thompson Sampling | | 0
Contextual Bandits for Advertising Budget Allocation | | 0
Near Optimal Adversarial Attacks on Stochastic Bandits and Defenses with Smoothed Responses | | 0

No leaderboard results yet.