Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step it draws a belief at random from the current posterior over the unknown reward model and chooses the action that maximizes the expected reward under that drawn belief.
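
As an illustration of the description above, here is a minimal sketch of Thompson sampling for a simulated Bernoulli bandit. It assumes binary rewards and a uniform Beta(1, 1) prior on each arm's success probability; neither assumption comes from this page. The function name `thompson_sampling` and the parameters `true_probs`, `n_rounds`, and `seed` are hypothetical names chosen for the example.

```python
import random


def thompson_sampling(true_probs, n_rounds=2000, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling (illustrative sketch).

    true_probs: assumed hidden success probability of each arm (simulation only).
    Returns (pulls, alphas, betas): pull counts and Beta posterior parameters.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    alphas = [1] * n_arms  # 1 + observed successes (Beta(1, 1) prior, an assumption)
    betas = [1] * n_arms   # 1 + observed failures
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Randomly drawn belief: one sampled success probability per arm.
        samples = [rng.betavariate(alphas[a], betas[a]) for a in range(n_arms)]
        # Choose the action that maximizes expected reward under that belief.
        arm = max(range(n_arms), key=lambda a: samples[a])
        # Observe a simulated binary reward and update the posterior.
        reward = 1 if rng.random() < true_probs[arm] else 0
        alphas[arm] += reward
        betas[arm] += 1 - reward
        pulls[arm] += 1

    return pulls, alphas, betas


if __name__ == "__main__":
    pulls, alphas, betas = thompson_sampling([0.2, 0.5, 0.8])
    print("pulls per arm:", pulls)
```

Because each arm's sample is drawn from its own posterior, arms with few observations still win the argmax occasionally (exploration), while arms with strong evidence of high reward win most of the time (exploitation).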

Papers

Showing 31–40 of 655 papers

Title	Date	Tasks	Status	Hype
When and why randomised exploration works (in linear bandits)	Feb 13, 2025	Thompson Sampling	Unverified	0
KABB: Knowledge-Aware Bayesian Bandits for Dynamic Expert Coordination in Multi-Agent Systems	Feb 11, 2025	Thompson Sampling	Unverified	0
Contextual Thompson Sampling via Generation of Missing Data	Feb 10, 2025	Decision Making, Fairness	Unverified	0
An Information-Theoretic Analysis of Thompson Sampling with Infinite Action Spaces	Feb 4, 2025	Thompson Sampling	Unverified	0
Active RLHF via Best Policy Learning from Trajectory Preference Feedback	Jan 31, 2025	Thompson Sampling	Unverified	0
FedRTS: Federated Robust Pruning via Combinatorial Thompson Sampling	Jan 31, 2025	Federated Learning, Thompson Sampling	Code Available	0
Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning	Jan 29, 2025	Continuous Control	Code Available	1
EVaDE : Event-Based Variational Thompson Sampling for Model-Based Reinforcement Learning	Jan 16, 2025	Model-based Reinforcement Learning, Reinforcement Learning	Unverified	0
Truthful mechanisms for linear bandit games with private contexts	Jan 7, 2025	Thompson Sampling	Unverified	0
Stochastically Constrained Best Arm Identification with Thompson Sampling	Jan 7, 2025	Thompson Sampling	Unverified	0

No leaderboard results yet.