Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
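
As a rough illustration of "choosing the action that maximizes the expected reward with respect to a randomly drawn belief", here is a minimal sketch for the Bernoulli bandit case. It assumes binary rewards and a uniform Beta(1, 1) prior on each arm's success probability; those modelling choices, the function name `thompson_sampling`, and the simulated `true_probs` are illustrative assumptions, not taken from any paper listed on this page.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling on a simulated bandit.

    true_probs: hypothetical success probabilities, used only to simulate
                rewards (the agent never reads them directly).
    Returns (pull counts per arm, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    pulls = [0] * n_arms
    total_reward = 0

    for _ in range(n_rounds):
        # Randomly drawn belief: one sample per arm from its Beta posterior,
        # assuming a uniform Beta(1, 1) prior.
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        # Choose the action that is best under the sampled belief.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulate a Bernoulli reward from the environment.
        reward = 1 if rng.random() < true_probs[arm] else 0

        # Posterior update for the pulled arm only.
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward


if __name__ == "__main__":
    pulls, total = thompson_sampling([0.1, 0.5, 0.9], n_rounds=2000)
    print("pulls per arm:", pulls)
    print("total reward:", total)
```

Arms with few observations have wide posteriors, so they occasionally produce the largest sample and get explored; as evidence accumulates, the posteriors narrow and the arm with the highest estimated reward is chosen most of the time.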

Papers

Showing 511–520 of 655 papers

Title	Date	Tasks	Status
Two-Stage Resource Allocation in Reconfigurable Intelligent Surface Assisted Hybrid Networks via Multi-Player Bandits	Jun 9, 2024	Thompson Sampling	Unverified
Uncertainty-Aware Search and Value Models: Mitigating Search Scaling Flaws in LLMs	Feb 16, 2025	GSM8K, Thompson Sampling	Unverified
Understanding the Training and Generalization of Pretrained Transformer for Sequential Decision Making	May 23, 2024	Decision Making, Sequential Decision Making	Unverified
Reinforcement Learning in Credit Scoring and Underwriting	Dec 15, 2022	Decision Making, Efficient Exploration	Unverified
Unimodal Thompson Sampling for Graph-Structured Arms	Nov 17, 2016	Thompson Sampling	Unverified
Using Adaptive Experiments to Rapidly Help Students	Aug 10, 2022	Thompson Sampling	Unverified
Variable Selection via Thompson Sampling	Jul 1, 2020	BIG-bench Machine Learning, Interpretable Machine Learning	Unverified
Variational Bayesian Optimistic Sampling	Oct 29, 2021	Thompson Sampling	Unverified
WAPTS: A Weighted Allocation Probability Adjusted Thompson Sampling Algorithm for High-Dimensional and Sparse Experiment Settings	Jan 7, 2025	Thompson Sampling	Unverified
When and Whom to Collaborate with in a Changing Environment: A Collaborative Dynamic Bandit Solution	Apr 14, 2021	Bayesian Inference, Collaborative Filtering	Unverified


No leaderboard results yet.