
Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
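The idea can be illustrated with a minimal sketch for the Bernoulli bandit: keep a Beta posterior over each arm's success probability, sample one value per arm from those posteriors, and play the arm with the largest sample. The function name and parameters below are illustrative, not from any specific paper on this page.

```python
import random

def thompson_sampling(true_probs, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling on a multi-armed bandit.

    true_probs: the (unknown to the agent) success probability of each arm.
    Returns the per-arm success/failure counts and the total reward collected.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [1] * n_arms  # Beta(1, 1) uniform prior on each arm
    failures = [1] * n_arms
    total_reward = 0
    for _ in range(n_rounds):
        # Draw one sample from each arm's posterior belief...
        samples = [rng.betavariate(successes[i], failures[i])
                   for i in range(n_arms)]
        # ...and act greedily with respect to that randomly drawn belief.
        arm = max(range(n_arms), key=lambda i: samples[i])
        reward = 1 if rng.random() < true_probs[arm] else 0
        total_reward += reward
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures, total_reward
```

Because arms with uncertain posteriors occasionally produce large samples, the algorithm keeps exploring them; as evidence accumulates, the posteriors concentrate and play shifts toward the best arm.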

Papers

Showing 71–80 of 655 papers

Title | Status | Hype
RoME: A Robust Mixed-Effects Bandit Algorithm for Optimizing Mobile Health Interventions | Code | 0
Double Thompson Sampling for Dueling Bandits | Code | 0
Constructing Adversarial Examples for Vertical Federated Learning: Optimal Client Corruption through Multi-Armed Bandit | Code | 0
Adaptive Thompson Sampling Stacks for Memory Bounded Open-Loop Planning | Code | 0
Bayesian bandits: balancing the exploration-exploitation tradeoff via double sampling | Code | 0
Bayesian Algorithms for Decentralized Stochastic Bandits | Code | 0
Differentially Private Online Bayesian Estimation With Adaptive Truncation | Code | 0
Bayesian Non-stationary Linear Bandits for Large-Scale Recommender Systems | Code | 0
Bandit Learning with Implicit Feedback | Code | 0
Page 8 of 66

No leaderboard results yet.