SOTAVerified|Agents Browse Leaderboard About

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 161–170 of 655 papers

Title	Date	Tasks	Status
Constrained Contextual Bandit Learning for Adaptive Radar Waveform Selection	Mar 9, 2021	Thompson Sampling	—Unverified
Constrained Thompson Sampling for Real-Time Electricity Pricing with Grid Reliability Constraints	Jun 17, 2020	Thompson Sampling	—Unverified
Constrained Thompson Sampling for Wireless Link Optimization	Feb 28, 2019	Thompson Sampling	—Unverified
A Reinforcement Learning based Reset Policy for CDCL SAT Solvers	Apr 4, 2024	reinforcement-learningReinforcement Learning	—Unverified
A relaxed technical assumption for posterior sampling-based reinforcement learning for control of unknown linear systems	Aug 19, 2021	Thompson Sampling	—Unverified
Context Attentive Bandits: Contextual Bandit with Restricted Context	May 10, 2017	Recommendation SystemsThompson Sampling	—Unverified
Context Attribution with Multi-Armed Bandit Optimization	Jun 24, 2025	Thompson Sampling	—Unverified
Adaptive Portfolio by Solving Multi-armed Bandit via Thompson Sampling	Nov 13, 2019	Decision MakingManagement	—Unverified
Contextual Bandits for Advertising Budget Allocation	Aug 22, 2020	MarketingMulti-Armed Bandits	—Unverified
An Information-Theoretic Analysis of Thompson Sampling for Logistic Bandits	Dec 3, 2024	Thompson Sampling	—Unverified

Show:10 25 50

← PrevPage 17 of 66Next →

No leaderboard results yet.