SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
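The description above can be made concrete with a minimal sketch. It assumes a Bernoulli bandit (each arm pays 0 or 1) with an independent uniform Beta(1, 1) prior on each arm's success probability; that model is an illustrative assumption and is not taken from any paper listed here. The names `thompson_step` and `run_bandit` are hypothetical. Each round draws one belief per arm from the posterior and plays the arm whose draw is largest.

```python
import random


def thompson_step(successes, failures, rng):
    """Draw one sampled mean per arm from its Beta posterior; return the best arm."""
    # Beta(s + 1, f + 1) is the posterior under the assumed uniform Beta(1, 1) prior.
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_means, n_rounds, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit; return pull counts per arm."""
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Simulated environment: reward is 1 with probability true_means[arm].
        if rng.random() < true_means[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


if __name__ == "__main__":
    # Hypothetical arm means chosen only for illustration.
    print(run_bandit([0.2, 0.5, 0.8], n_rounds=2000))
```

Because the posterior of a rarely pulled arm stays wide, that arm still occasionally produces the largest draw, which is how exploration arises without a separate exploration parameter.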

Papers

Showing 501-525 of 655 papers

Title | Status | Hype
Randomised Bayesian Least-Squares Policy Iteration |  | 0
Sampling Acquisition Functions for Batch Bayesian Optimization |  | 0
On Multi-Armed Bandit Designs for Dose-Finding Clinical Trials |  | 0
Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics | Code | 0
Meta Dynamic Pricing: Transfer Learning Across Experiments |  | 0
Constrained Thompson Sampling for Wireless Link Optimization |  | 0
Fully Distributed Bayesian Optimization with Stochastic Policies |  | 0
Multi-Armed Bandit Strategies for Non-Stationary Reward Distributions and Delayed Feedback Processes |  | 0
Scalable Thompson Sampling via Optimal Transport |  | 0
Thompson Sampling with Information Relaxation Penalties | Code | 0
KLUCB Approach to Copeland Bandits |  | 0
First-Order Bayesian Regret Analysis of Thompson Sampling |  | 0
Contextual Multi-armed Bandit Algorithm for Semiparametric Reward Model |  | 0
Thompson Sampling for a Fatigue-aware Online Recommendation System | Code | 0
Parallel Contextual Bandits in Wireless Handover Optimization |  | 0
Information-Directed Exploration for Deep Reinforcement Learning | Code | 0
MergeDTS: A Method for Effective Large-Scale Online Ranker Evaluation | Code | 0
Thompson Sampling for Noncompliant Bandits |  | 0
Bandit Learning with Implicit Feedback | Code | 0
Optimal Learning for Dynamic Coding in Deadline-Constrained Multi-Channel Networks |  | 0
Adapting multi-armed bandits policies to contextual bandits scenarios | Code | 0
Thompson Sampling for Pursuit-Evasion Problems |  | 0
Practical Bayesian Learning of Neural Networks via Adaptive Optimisation Methods | Code | 0
A Unified Approach to Translate Classical Bandit Algorithms to the Structured Bandit Setting |  | 0
Combining Bayesian Optimization and Lipschitz Optimization |  | 0

No leaderboard results yet.