
Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step, it draws a belief about the rewards at random from the current posterior and chooses the action that maximizes the expected reward under that drawn belief.
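The idea can be illustrated with a minimal sketch for the simplest case, a Bernoulli bandit with independent Beta(1, 1) priors on each arm. The function name `thompson_sampling`, the simulated `true_probs`, and the choice of prior are assumptions made for this illustration; they do not come from any of the papers listed here.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling on a simulated bandit (illustrative).

    true_probs: hypothetical success probability of each arm, used only to
    simulate rewards; the learner never reads it directly.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Randomly drawn belief: one sample per arm from its Beta posterior,
        # Beta(1 + successes, 1 + failures) under the assumed Beta(1, 1) prior.
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        # Act greedily with respect to the drawn belief.
        arm = max(range(n_arms), key=samples.__getitem__)

        # Simulate a Bernoulli reward and update that arm's posterior counts.
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = thompson_sampling([0.2, 0.5, 0.8], 2000)
    print("pulls per arm:", pulls)
```

Arms with little data have wide posteriors, so they occasionally produce the largest sample and get explored; as evidence accumulates, the posteriors concentrate and the arm with the highest estimated reward is chosen most of the time.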

Papers


Showing 641–650 of 655 papers

Title	Date	Tasks	Status
Thompson Sampling with Information Relaxation Penalties	Feb 12, 2019	Thompson Sampling	Code Available
Efficient Optimal Selection for Composited Advertising Creatives with Tree Structure	Mar 2, 2021	Efficient Exploration, Thompson Sampling	Code Available
Odds-Ratio Thompson Sampling to Control for Time-Varying Effect	Mar 4, 2020	Thompson Sampling	Code Available
Old Dog Learns New Tricks: Randomized UCB for Bandit Problems	Oct 11, 2019	Thompson Sampling	Code Available
Thompson Sampling for Multinomial Logit Contextual Bandits	Dec 1, 2019	Multi-Armed Bandits, Thompson Sampling	Code Available
Trajectory-oriented optimization of stochastic epidemiological models	May 6, 2023	Thompson Sampling	Code Available
On Bits and Bandits: Quantifying the Regret-Information Trade-off	May 26, 2024	Decision Making, Question Answering	Code Available
Learning to Play Imperfect-Information Games by Imitating an Oracle Planner	Dec 22, 2020	Thompson Sampling	Code Available
Process-constrained batch Bayesian approaches for yield optimization in multi-reactor systems	Aug 5, 2024	Bayesian Optimization, Thompson Sampling	Code Available
ESCADA: Efficient Safety and Context Aware Dose Allocation for Precision Medicine	Nov 26, 2021	Thompson Sampling	Code Available



No leaderboard results yet.