Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
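As a rough illustration of the description above, here is a minimal sketch for a Bernoulli bandit, assuming a uniform Beta(1, 1) prior on each arm's success probability. The function name `thompson_sampling`, the arm probabilities, and the round count are hypothetical choices made for this example and do not come from any of the listed papers. Each round draws one plausible success rate per arm from its posterior (the "randomly drawn belief") and plays the arm that looks best under that draw.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling on simulated arms (illustrative sketch)."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Draw one belief per arm from its posterior Beta(1 + s, 1 + f).
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        # Act greedily with respect to the sampled belief.
        arm = max(range(n_arms), key=samples.__getitem__)

        # Simulate a Bernoulli reward from the (hidden) true probability.
        reward = 1 if rng.random() < true_probs[arm] else 0

        # Update the chosen arm's posterior counts.
        pulls[arm] += 1
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = thompson_sampling([0.1, 0.5, 0.9], n_rounds=2000)
    print("pulls per arm:", pulls)
```

Because arms with uncertain posteriors occasionally produce high samples, they still get explored early on, while most pulls shift toward the arm with the highest observed success rate as evidence accumulates.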

Papers

Showing 231–240 of 655 papers

Title	Date	Tasks	Status
Asymptotically Optimal Bandits under Weighted Information	May 28, 2021	Thompson Sampling	Unverified
Efficient Learning in Large-Scale Combinatorial Semi-Bandits	Jun 28, 2014	Thompson Sampling	Unverified
A General Theory of the Stochastic Linear Bandit and Its Applications	Feb 12, 2020	Multi-Armed Bandits, Thompson Sampling	Unverified
Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling	Oct 7, 2024	Continuous Control	Unverified
Efficient Multivariate Bandit Algorithm with Path Planning	Sep 6, 2019	Heuristic Search, Thompson Sampling	Unverified
Efficient Online Learning for Cognitive Radar-Cellular Coexistence via Contextual Thompson Sampling	Aug 24, 2020	Deep Reinforcement Learning, Thompson Sampling	Unverified
Cost-efficient Knowledge-based Question Answering with Large Language Models	May 27, 2024	Knowledge Graphs, Model Selection	Unverified
Efficient Thompson Sampling for Online Matrix-Factorization Recommendation	Dec 1, 2015	Collaborative Filtering, Recommendation Systems	Unverified
Cost Aware Asynchronous Multi-Agent Active Search	Oct 5, 2022	Decision Making, Thompson Sampling	Unverified
Asymptotically Optimal Algorithms for Budgeted Multiple Play Bandits	Jun 30, 2016	Thompson Sampling	Unverified


No leaderboard results yet.