SOTAVerified|Agents Browse Leaderboard About

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 161–170 of 655 papers

Title	Date	Tasks	Status
Monte-Carlo tree search with uncertainty propagation via optimal transport	Sep 19, 2023	Thompson Sampling	—Unverified
Task Selection and Assignment for Multi-modal Multi-task Dialogue Act Classification with Non-stationary Multi-armed Bandits	Sep 18, 2023	Dialogue Act ClassificationMulti-Armed Bandits	—Unverified
gym-saturation: Gymnasium environments for saturation provers (System description)	Sep 16, 2023	OpenAI Gymreinforcement-learning	—Unverified
Generalized Regret Analysis of Thompson Sampling using Fractional Posteriors	Sep 12, 2023	Thompson Sampling	—Unverified
Simple Modification of the Upper Confidence Bound Algorithm by Generalized Weighted Averages	Aug 28, 2023	Decision MakingDecision Making Under Uncertainty	CodeCode Available
Cost-Efficient Online Decision Making: A Combinatorial Multi-Armed Bandit Approach	Aug 21, 2023	Decision MakingMulti-Armed Bandits	CodeCode Available
Thompson Sampling for Real-Valued Combinatorial Pure Exploration of Multi-Armed Bandit	Aug 20, 2023	Thompson Sampling	—Unverified
AdaptEx: A Self-Service Contextual Bandit Platform	Aug 8, 2023	Multi-Armed BanditsThompson Sampling	—Unverified
Bag of Policies for Distributional Deep Exploration	Aug 3, 2023	Atari GamesEfficient Exploration	—Unverified
VITS : Variational Inference Thompson Sampling for contextual bandits	Jul 19, 2023	Multi-Armed BanditsThompson Sampling	CodeCode Available

Show:10 25 50

← PrevPage 17 of 66Next →

No leaderboard results yet.