SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step it draws a belief about the rewards at random from the current posterior distribution and chooses the action that maximizes the expected reward under that draw.
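As an illustration of the idea described above, here is a minimal sketch for the simplest case: a bandit whose arms pay 0 or 1. It assumes Bernoulli rewards and a uniform Beta(1, 1) prior on each arm, neither of which is stated on this page, and the names `thompson_step` and `run_bandit` are hypothetical helpers chosen for this example.

```python
import random


def thompson_step(successes, failures, rng):
    """Choose an arm by sampling one belief per arm and acting greedily on it."""
    # Draw a plausible mean reward for each arm from its Beta posterior.
    # The "+ 1" terms encode the assumed Beta(1, 1) (uniform) prior.
    samples = [
        rng.betavariate(s + 1, f + 1)
        for s, f in zip(successes, failures)
    ]
    # Act as if the sampled belief were true: pick the arm with the largest draw.
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit (illustrative only)."""
    rng = random.Random(seed)
    successes = [0] * len(true_probs)
    failures = [0] * len(true_probs)
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Simulated environment: reward 1 with the arm's true probability.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    wins, losses = run_bandit([0.1, 0.9], n_rounds=2000)
    pulls = [w + l for w, l in zip(wins, losses)]
    print("pulls per arm:", pulls)
```

Because each arm's draw comes from its own posterior, arms with little data have wide posteriors and are still chosen occasionally (exploration), while arms with strong evidence of high reward are chosen most of the time (exploitation).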

Papers

Showing 581–590 of 655 papers

Title | Status | Hype
Atlas: Automate Online Service Configuration in Network Slicing | Code | 0
Scalable Optimization for Wind Farm Control using Coordination Graphs | Code | 0
Variational inference for the multi-armed contextual bandit | Code | 0
Cost-Efficient Online Decision Making: A Combinatorial Multi-Armed Bandit Approach | Code | 0
Mixed-Effect Thompson Sampling | Code | 0
On the Suboptimality of Thompson Sampling in High Dimensions | Code | 0
Randomized Value Functions via Multiplicative Normalizing Flows | Code | 0
Minimum Empirical Divergence for Sub-Gaussian Linear Bandits | Code | 0
Ranking In Generalized Linear Bandits | Code | 0
RoME: A Robust Mixed-Effects Bandit Algorithm for Optimizing Mobile Health Interventions | Code | 0

No leaderboard results yet.