
Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
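For a Bernoulli bandit, the "randomly drawn belief" is typically a sample from each arm's Beta posterior over its reward probability. The sketch below is a minimal, hypothetical illustration (the arm probabilities and helper names are made up for the example), assuming a uniform Beta(1, 1) prior:

```python
import random

def thompson_step(successes, failures, rng=random):
    """Pick an arm by sampling each arm's Beta posterior and taking the argmax.

    successes[i] and failures[i] count observed rewards for arm i;
    Beta(1 + s, 1 + f) corresponds to a uniform prior over each arm's mean.
    """
    samples = [rng.betavariate(1 + s, 1 + f)
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)

def simulate(probs, steps=2000, seed=0):
    """Run Thompson sampling on a Bernoulli bandit with the given arm means."""
    rng = random.Random(seed)
    successes = [0] * len(probs)
    failures = [0] * len(probs)
    for _ in range(steps):
        arm = thompson_step(successes, failures, rng)
        if rng.random() < probs[arm]:   # reward drawn from the chosen arm
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures

# Illustrative run: over time, play concentrates on the best arm.
s, f = simulate([0.2, 0.5, 0.8])
```

Because poorly explored arms have wide posteriors, they are occasionally sampled high and get played (exploration), while well-explored good arms win the argmax most of the time (exploitation).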

Papers

Showing 576–600 of 655 papers

Title | Status | Hype
Reinforcement learning techniques for Outer Loop Link Adaptation in 4G/5G systems | — | 0
Streaming kernel regression with provably adaptive mean, variance, and regularization | — | 0
Counterfactual Data-Fusion for Online Reinforcement Learners | — | 0
Taming Non-stationary Bandits: A Bayesian Approach | — | 0
Combinatorial Multi-armed Bandit with Probabilistically Triggered Arms: A Case with Bounded Regret | — | 0
Calibrated Fairness in Bandits | — | 0
A Practical Method for Solving Contextual Bandit Problems Using Decision Trees | — | 0
Bandit Models of Human Behavior: Reward Processing in Mental Disorders | — | 0
Parallel and Distributed Thompson Sampling for Large-scale Accelerated Exploration of Chemical Space | — | 0
Thompson Sampling for the MNL-Bandit | — | 0
Scalable Generalized Linear Bandits: Online Computation and Hashing | — | 0
Asynchronous Parallel Bayesian Optimisation via Thompson Sampling | Code | 0
A Multi-Armed Bandit to Smartly Select a Training Set from Big Medical Data | — | 0
AIXIjs: A Software Demo for General Reinforcement Learning | Code | 0
Ensemble Sampling | — | 0
Posterior sampling for reinforcement learning: worst-case regret bounds | — | 0
Adaptive Rate of Convergence of Thompson Sampling for Gaussian Process Optimization | — | 0
Context Attentive Bandits: Contextual Bandit with Restricted Context | — | 0
Multi-dueling Bandits with Dependent Arms | — | 0
Mostly Exploration-Free Algorithms for Contextual Bandits | Code | 0
Time-Sensitive Bandit Learning and Satisficing Thompson Sampling | — | 0
Efficient Benchmarking of NLP APIs using Multi-armed Bandits | — | 0
Thompson Sampling for Linear-Quadratic Control Problems | — | 0
Horde of Bandits using Gaussian Markov Random Fields | — | 0
QoS-Aware Multi-Armed Bandits | — | 0
Page 24 of 27

No leaderboard results yet.