
Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
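As a rough illustration of the definition above, the following is a minimal sketch for a Bernoulli bandit. It assumes a uniform Beta(1, 1) prior on each arm's success probability; the function names `thompson_step` and `run_bandit`, the arm probabilities, and the round count are all hypothetical choices made for this example, not taken from any paper listed on this page. Each round draws one sample per arm from its posterior (the "randomly drawn belief") and plays the arm that looks best under that draw.

```python
import random


def thompson_step(successes, failures, rng):
    """Sample one belief per arm and return the index of the best-looking arm."""
    # Posterior for each arm under a Beta(1, 1) prior is Beta(s + 1, f + 1).
    samples = [
        rng.betavariate(s + 1, f + 1)
        for s, f in zip(successes, failures)
    ]
    # Act greedily with respect to the sampled belief.
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds=2000, seed=0):
    """Simulate Thompson sampling and return how often each arm was pulled."""
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Simulated Bernoulli reward from the (hidden) true probability.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


if __name__ == "__main__":
    # Hypothetical arm probabilities chosen only for demonstration.
    print(run_bandit([0.2, 0.5, 0.8]))
```

Because the action is chosen from a posterior sample rather than the posterior mean, arms with few observations still get played occasionally, which is where the exploration comes from. As evidence accumulates, the posteriors concentrate and most pulls go to the arm with the highest estimated reward.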

Papers


Showing 61–70 of 655 papers

Title	Date	Tasks	Status
Active RLHF via Best Policy Learning from Trajectory Preference Feedback	Jan 31, 2025	Thompson Sampling	Unverified
FedRTS: Federated Robust Pruning via Combinatorial Thompson Sampling	Jan 31, 2025	Federated Learning, Thompson Sampling	Code Available
EVaDE: Event-Based Variational Thompson Sampling for Model-Based Reinforcement Learning	Jan 16, 2025	Model-based Reinforcement Learning, reinforcement-learning	Unverified
Truthful mechanisms for linear bandit games with private contexts	Jan 7, 2025	Thompson Sampling	Unverified
Stochastically Constrained Best Arm Identification with Thompson Sampling	Jan 7, 2025	Thompson Sampling	Unverified
WAPTS: A Weighted Allocation Probability Adjusted Thompson Sampling Algorithm for High-Dimensional and Sparse Experiment Settings	Jan 7, 2025	Thompson Sampling	Unverified
On Improved Regret Bounds In Bayesian Optimization with Gaussian Noise	Dec 25, 2024	Bayesian Optimization, Thompson Sampling	Unverified
Generalized Bayesian deep reinforcement learning	Dec 16, 2024	Deep Reinforcement Learning, reinforcement-learning	Unverified
An Information-Theoretic Analysis of Thompson Sampling for Logistic Bandits	Dec 3, 2024	Thompson Sampling	Unverified
BOTS: Batch Bayesian Optimization of Extended Thompson Sampling for Severely Episode-Limited RL Settings	Nov 30, 2024	Bayesian Optimization, Policy Gradient Methods	Unverified

