SOTAVerified|Agents Browse Leaderboard About

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 91–100 of 655 papers

Title	Date	Tasks	Status
Improving Thompson Sampling via Information Relaxation for Budgeted Multi-armed Bandits	Aug 28, 2024	Multi-Armed BanditsThompson Sampling	—Unverified
An Extremely Data-efficient and Generative LLM-based Reinforcement Learning Agent for Recommenders	Aug 28, 2024	Recommendation SystemsThompson Sampling	—Unverified
Contextual Bandit with Herding Effects: Algorithms and Recommendation Applications	Aug 26, 2024	Multi-Armed BanditsThompson Sampling	—Unverified
Constructing Adversarial Examples for Vertical Federated Learning: Optimal Client Corruption through Multi-Armed Bandit	Aug 8, 2024	Federated LearningThompson Sampling	CodeCode Available
Optimization-Driven Adaptive Experimentation	Aug 8, 2024	GPUThompson Sampling	—Unverified
Anytime Multi-Agent Path Finding with an Adaptive Delay-Based Heuristic	Aug 6, 2024	Multi-Agent Path FindingSelf-Learning	CodeCode Available
Process-constrained batch Bayesian approaches for yield optimization in multi-reactor systems	Aug 5, 2024	Bayesian OptimizationThompson Sampling	CodeCode Available
Neural Dueling Bandits: Preference-Based Optimization with Human Feedback	Jul 24, 2024	Thompson Sampling	—Unverified
Thompson Sampling Itself is Differentially Private	Jul 20, 2024	Thompson Sampling	—Unverified
Scalable Exploration via Ensemble++	Jul 18, 2024	Computational EfficiencyDecision Making	CodeCode Available

Show:10 25 50

← PrevPage 10 of 66Next →

No leaderboard results yet.