Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step it draws a belief at random from the current posterior over the unknown reward model and chooses the action that maximizes the expected reward under that drawn belief.
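
As a rough illustration of the rule described above, here is a minimal sketch for the Bernoulli bandit case. It assumes a uniform Beta(1, 1) prior on each arm's success probability, which is a common textbook choice and not something this page specifies. The names `thompson_step` and `run_bandit` are hypothetical and chosen only for this example.

```python
import random


def thompson_step(successes, failures, rng):
    """Sample one belief per arm from its Beta posterior and pick the best arm.

    Assumption: each arm has a Beta(1, 1) prior, so the posterior after
    s successes and f failures is Beta(s + 1, f + 1).
    """
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    # Act greedily with respect to the randomly drawn belief.
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds, seed=0):
    """Simulate a Bernoulli bandit and return per-arm success and failure counts."""
    rng = random.Random(seed)
    successes = [0] * len(true_probs)
    failures = [0] * len(true_probs)
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Draw a 0/1 reward from the (simulated) true arm probability.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    s, f = run_bandit([0.1, 0.5, 0.9], n_rounds=2000)
    print("pulls per arm:", [a + b for a, b in zip(s, f)])
```

Because the action is chosen from a posterior sample and not from the posterior mean, arms with few observations still get tried occasionally, while arms that look clearly worse are pulled less and less often.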

Papers

Showing 111–120 of 655 papers

Title	Date	Tasks	Status
Two-Stage Resource Allocation in Reconfigurable Intelligent Surface Assisted Hybrid Networks via Multi-Player Bandits	Jun 9, 2024	Thompson Sampling	Unverified
Adaptively Learning to Select-Rank in Online Platforms	Jun 7, 2024	Multi-Armed Bandits, Thompson Sampling	Unverified
Speculative Decoding via Early-exiting for Faster LLM Inference with Thompson Sampling Control Mechanism	Jun 6, 2024	Thompson Sampling	Unverified
Posterior Sampling via Autoregressive Generation	May 29, 2024	Articles, Decision Making	Unverified
Approximate Thompson Sampling for Learning Linear Quadratic Regulators with O(√T) Regret	May 29, 2024	Thompson Sampling	Unverified
Cost-efficient Knowledge-based Question Answering with Large Language Models	May 27, 2024	Knowledge Graphs, Model Selection	Unverified
On Bits and Bandits: Quantifying the Regret-Information Trade-off	May 26, 2024	Decision Making, Question Answering	Code Available
Code Repair with LLMs gives an Exploration-Exploitation Tradeoff	May 26, 2024	Code Repair, Language Modeling	Unverified
Indexed Minimum Empirical Divergence-Based Algorithms for Linear Bandits	May 24, 2024	Multi-Armed Bandits, Thompson Sampling	Unverified
No Algorithmic Collusion in Two-Player Blindfolded Game with Thompson Sampling	May 23, 2024	Thompson Sampling	Unverified

No leaderboard results yet.