SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
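As an illustration of the description above, here is a minimal sketch of Thompson sampling for a Bernoulli bandit. It assumes a uniform Beta(1, 1) prior on each arm's success probability, which the text above does not specify. The function names `thompson_step` and `run_bandit` and the simulated reward probabilities are hypothetical, chosen only for this example. Each round draws one belief per arm from its posterior and plays the arm that looks best under that draw.

```python
import random


def thompson_step(successes, failures, rng):
    """Choose an arm by drawing one sample per arm from its Beta posterior.

    Assumes a Beta(1, 1) prior, so the posterior for an arm with s successes
    and f failures is Beta(s + 1, f + 1).
    """
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    # Act greedily with respect to the randomly drawn belief.
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds, seed=0):
    """Simulate a Bernoulli bandit; true_probs are hypothetical arm means."""
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Simulated environment: reward is 1 with probability true_probs[arm].
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    s, f = run_bandit([0.2, 0.5, 0.8], n_rounds=2000)
    print("pulls per arm:", [a + b for a, b in zip(s, f)])
```

Because arms with few observations have wide posteriors, they are still sampled occasionally, which is how the method explores without a separate exploration parameter.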

Papers

Showing 601-625 of 655 papers

| Title | Status | Hype |
| --- | --- | --- |
| Stacked Thompson Bandits | Code | 0 |
| Thompson Sampling For Stochastic Bandits with Graph Feedback | | 0 |
| Estimating Quality in Multi-Objective Bandits Optimization | | 0 |
| Exploration for Multi-task Reinforcement Learning with Deep Generative Models | | 0 |
| Nonparametric General Reinforcement Learning | | 0 |
| Linear Thompson Sampling Revisited | | 0 |
| Unimodal Thompson Sampling for Graph-Structured Arms | | 0 |
| The End of Optimism? An Asymptotic Analysis of Finite-Armed Linear Bandits | | 0 |
| A Formal Solution to the Grain of Truth Problem | | 0 |
| BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems | | 0 |
| Human collective intelligence as distributed Bayesian inference | | 0 |
| Asymptotically Optimal Algorithms for Budgeted Multiple Play Bandits | | 0 |
| Online Algorithms For Parameter Mean And Variance Estimation In Dynamic Regression Models | | 0 |
| Linear Bandit algorithms using the Bootstrap | | 0 |
| Double Thompson Sampling for Dueling Bandits | Code | 0 |
| An Unbiased Data Collection and Content Exploitation/Exploration Strategy for Personalization | | 0 |
| A sequential Monte Carlo approach to Thompson sampling for Bayesian optimization | | 0 |
| Optimal Recommendation to Users that React: Online Learning for a Class of POMDPs | | 0 |
| Cascading Bandits for Large-Scale Recommendation Problems | Code | 0 |
| Simple Bayesian Algorithms for Best Arm Identification | | 0 |
| Thompson Sampling is Asymptotically Optimal in General Environments | | 0 |
| Convolutional Monte Carlo Rollouts in Go | | 0 |
| Efficient Thompson Sampling for Online Matrix-Factorization Recommendation | | 0 |
| Regret Analysis of the Finite-Horizon Gittins Index Strategy for Multi-Armed Bandits | | 0 |
| TSEB: More Efficient Thompson Sampling for Policy Learning | | 0 |
