SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
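
As an illustration of the definition above, here is a minimal sketch for a Bernoulli bandit. It assumes a uniform Beta(1, 1) prior on each arm's success probability, so the "randomly drawn belief" is one Beta posterior sample per arm. The names `thompson_step` and `run_bandit` and the arm means are hypothetical, chosen for this example only.

```python
import random


def thompson_step(successes, failures, rng):
    """Choose an arm: draw one mean per arm from its Beta posterior, take the argmax."""
    # Beta(s + 1, f + 1) is the posterior under an assumed uniform Beta(1, 1) prior.
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_means, rounds, seed=0):
    """Simulate Thompson sampling and return how often each arm was pulled."""
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k
    failures = [0] * k
    for _ in range(rounds):
        arm = thompson_step(successes, failures, rng)
        # Simulated Bernoulli reward; true_means is unknown to the policy.
        if rng.random() < true_means[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


if __name__ == "__main__":
    print(run_bandit([0.1, 0.5, 0.9], rounds=2000))
```

Early on the posteriors are wide, so every arm has a real chance of producing the largest sample and being explored. As evidence accumulates, the samples concentrate and the arm with the highest estimated mean is chosen most of the time.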

Papers

Showing 271–280 of 655 papers

Title | Status | Hype
Fourier Representations for Black-Box Optimization over Categorical Variables | | 0
Freshness-Aware Thompson Sampling | | 0
From Bandits Model to Deep Deterministic Policy Gradient, Reinforcement Learning with Contextual Information | | 0
Fully Distributed Bayesian Optimization with Stochastic Policies | | 0
Gaussian Process Thompson Sampling via Rootfinding | | 0
Generalized Bayesian deep reinforcement learning | | 0
Generalized Probabilistic Bisection for Stochastic Root-Finding | | 0
Generalized Regret Analysis of Thompson Sampling using Fractional Posteriors | | 0
Generalized Thompson Sampling for Contextual Bandits | | 0
Bayesian Mixture Modelling and Inference based Thompson Sampling in Monte-Carlo Tree Search | | 0

No leaderboard results yet.