Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
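
The idea of "maximizing expected reward with respect to a randomly drawn belief" can be illustrated with a minimal sketch for the simplest case: arms with 0/1 (Bernoulli) rewards and a uniform Beta(1, 1) prior on each arm's success rate. The function name `thompson_sampling`, the simulated `true_probs`, and the choice of prior are assumptions made for illustration; they do not come from any of the papers listed here.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling (illustrative sketch).

    true_probs: hypothetical success probability of each simulated arm.
    Returns per-arm (pulls, successes, failures).
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Randomly drawn belief: one sample per arm from its Beta posterior,
        # Beta(1 + successes, 1 + failures) under the assumed uniform prior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Act greedily with respect to the sampled belief.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Observe a simulated 0/1 reward and update that arm's counts.
        reward = 1 if rng.random() < true_probs[arm] else 0
        pulls[arm] += 1
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return pulls, successes, failures


pulls, successes, failures = thompson_sampling([0.2, 0.5, 0.8], n_rounds=2000)
print("pulls per arm:", pulls)
```

Arms whose posterior is still wide occasionally produce a large sample and get tried, which is the exploration; as counts accumulate, the posteriors concentrate and the arm with the highest estimated success rate is chosen almost every round.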

Papers

Showing 411–420 of 655 papers

Title	Date	Tasks	Status
Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits	May 27, 2022	Multi-Armed Bandits, Thompson Sampling	Unverified
Linear Bandit algorithms using the Bootstrap	May 4, 2016	Thompson Sampling	Unverified
Linear Thompson Sampling Revisited	Nov 20, 2016	Thompson Sampling	Unverified
Little Exploration is All You Need	Oct 26, 2023	All, Thompson Sampling	Unverified
Maillard Sampling: Boltzmann Exploration Done Optimally	Nov 5, 2021	counterfactual, Thompson Sampling	Unverified
Making RL with Preference-based Feedback Efficient via Randomization	Oct 23, 2023	Active Learning, Thompson Sampling	Unverified
Making Sense of Reinforcement Learning and Probabilistic Inference	Jan 3, 2020	reinforcement-learning, Reinforcement Learning	Unverified
Markov Decision Process modeled with Bandits for Sequential Decision Making in Linear-flow	Jul 1, 2021	Decision Making, Marketing	Unverified
Optimization-Driven Adaptive Experimentation	Aug 8, 2024	GPU, Thompson Sampling	Unverified
Memory Sequence Length of Data Sampling Impacts the Adaptation of Meta-Reinforcement Learning Agents	Jun 18, 2024	continuous-control, Continuous Control	Unverified

