SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
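
The description above can be made concrete with a minimal sketch. It assumes Bernoulli (0/1) rewards and a uniform Beta(1, 1) prior on each arm's mean, neither of which this page specifies, and the names `thompson_sampling` and `true_means` are hypothetical, chosen only for illustration. Each round draws one belief per arm from its posterior, plays the arm whose drawn belief is highest, and updates that arm's posterior with the observed reward.

```python
import random


def thompson_sampling(true_means, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling on a simulated bandit (illustrative sketch).

    true_means: hypothetical success probabilities of each arm, unknown to the learner.
    Returns per-arm pull counts, successes, and failures.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Randomly drawn belief: one sample per arm from its Beta posterior.
        # Beta(1 + successes, 1 + failures) assumes a uniform Beta(1, 1) prior.
        beliefs = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Choose the action that maximizes expected reward under that drawn belief.
        arm = max(range(n_arms), key=beliefs.__getitem__)

        # Simulate a Bernoulli reward from the environment.
        reward = 1 if rng.random() < true_means[arm] else 0

        # Posterior update for the arm that was played.
        pulls[arm] += 1
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = thompson_sampling([0.1, 0.9], n_rounds=2000)
    print("pulls per arm:", pulls)
```

Exploration here comes from the randomness of the posterior draws: an arm with few observations has a wide posterior, so it occasionally produces the highest sample and gets tried, while well-estimated poor arms are played less and less often.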

Papers

Showing 451–460 of 655 papers

Title | Status | Hype
Thompson Sampling Achieves Õ(√T) Regret in Linear Quadratic Control | - | 0
Thompson Sampling with Approximate Inference | - | 0
Thompson Sampling and Approximate Inference | - | 0
Analysis of Thompson Sampling for Controlling Unknown Linear Diffusion Processes | - | 0
Thompson Sampling for 1-Dimensional Exponential Family Bandits | - | 0
Thompson Sampling for Adversarial Bit Prediction | - | 0
Thompson Sampling for Bandits with Clustered Arms | - | 0
Thompson Sampling for Budgeted Multi-armed Bandits | - | 0
Thompson Sampling Algorithms for Cascading Bandits | - | 0
Thompson Sampling for Combinatorial Network Optimization in Unknown Environments | - | 0

No leaderboard results yet.