SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
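The idea above can be sketched for the simplest case, a Bernoulli multi-armed bandit with Beta posteriors. This is a minimal illustrative example, not code from any of the papers listed below; the function name and parameters are chosen for illustration.

```python
import random

def thompson_sampling_bernoulli(arms, n_rounds, seed=0):
    """Thompson sampling on a Bernoulli bandit with Beta(1, 1) priors.

    `arms` holds the true success probability of each arm (unknown to
    the agent); returns total reward and the per-arm Beta parameters.
    """
    rng = random.Random(seed)
    n = len(arms)
    alpha = [1] * n  # posterior Beta alpha: 1 + observed successes
    beta = [1] * n   # posterior Beta beta:  1 + observed failures
    total_reward = 0
    for _ in range(n_rounds):
        # Draw one sample from each arm's posterior belief...
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(n)]
        # ...then act greedily with respect to that random draw.
        arm = max(range(n), key=lambda i: samples[i])
        reward = 1 if rng.random() < arms[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        total_reward += reward
    return total_reward, alpha, beta
```

Because each action is chosen by maximizing against a posterior sample rather than the posterior mean, arms with uncertain estimates still get explored, while clearly inferior arms are played less and less often as their posteriors concentrate.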

Papers

Showing 476–500 of 655 papers

Title | Status | Hype
An Arm-Wise Randomization Approach to Combinatorial Linear Semi-Bandits | | 0
Online Causal Inference for Advertising in Real-Time Bidding Auctions | | 0
A Batched Multi-Armed Bandit Approach to News Headline Testing | | 0
A Bayesian Choice Model for Eliminating Feedback Loops | | 0
Thompson Sampling with Approximate Inference | | 0
Scaling Multi-Armed Bandit Algorithms | | 0
Convergence Rates of Posterior Distributions in Markov Decision Process | | 0
Adaptive Thompson Sampling Stacks for Memory Bounded Open-Loop Planning | Code | 0
Thompson Sampling on Symmetric α-Stable Bandits | | 0
Thompson Sampling for Combinatorial Network Optimization in Unknown Environments | | 0
Mixed-Variable Bayesian Optimization | | 0
Bandit Learning for Diversified Interactive Recommendation | | 0
Thompson Sampling for Adversarial Bit Prediction | | 0
Revised Progressive-Hedging-Algorithm Based Two-layer Solution Scheme for Bayesian Reinforcement Learning | | 0
Sparse Spectrum Gaussian Process for Bayesian Optimization | | 0
Stochastic Neural Network with Kronecker Flow | | 0
The Intrinsic Robustness of Stochastic Bandits to Strategic Manipulation | | 0
Regret Bounds for Thompson Sampling in Episodic Restless Bandit Problems | Code | 0
Connections Between Mirror Descent, Thompson Sampling and the Information Ratio | | 0
Feedback graph regret bounds for Thompson Sampling and UCB | | 0
Adaptive Model Selection Framework: An Application to Airline Pricing | | 0
Adaptive Sensor Placement for Continuous Spaces | | 0
On the Performance of Thompson Sampling on Logistic Bandits | | 0
Memory Bounded Open-Loop Planning in Large POMDPs using Thompson Sampling | Code | 0
AutoSeM: Automatic Task Selection and Mixing in Multi-Task Learning | | 0
Page 20 of 27
