SOTAVerified

Thompson Sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step it draws a belief at random from the current posterior over reward models and chooses the action that maximizes the expected reward under that drawn belief.
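The rule described above can be sketched for the simplest case. This is a minimal illustration, not a reference implementation: it assumes Bernoulli (0/1) rewards and a uniform Beta(1, 1) prior on each arm, neither of which is specified on this page, and the names `thompson_step` and `run_bandit` are hypothetical.

```python
import random


def thompson_step(successes, failures, rng):
    """Choose an arm by drawing one belief per arm and acting greedily on the draw.

    Assumption: Bernoulli rewards with a Beta(1, 1) prior, so the posterior
    for each arm is Beta(successes + 1, failures + 1).
    """
    samples = [
        rng.betavariate(s + 1, f + 1)  # one randomly drawn belief per arm
        for s, f in zip(successes, failures)
    ]
    # Pick the arm whose sampled mean reward is largest.
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds=2000, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit; return pulls per arm."""
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Simulated environment: reward is 1 with the arm's true probability.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


if __name__ == "__main__":
    # Illustrative arm probabilities, chosen only for this demo.
    print(run_bandit([0.2, 0.5, 0.8]))
```

Early on, the posteriors are wide, so the sampled beliefs vary a lot and every arm gets tried. As evidence accumulates, the posteriors concentrate and the arm with the highest true reward probability is chosen most of the time.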

Papers

Showing 401-425 of 655 papers

Title | Status | Hype
Learning by Repetition: Stochastic Multi-armed Bandits under Priming Effect | | 0
Sample Efficient Learning of Factored Embeddings of Tensor Fields | | 0
Learning How to Infer Partial MDPs for In-Context Adaptation and Exploration | | 0
Learning to Optimize Via Posterior Sampling | | 0
Learning to Price with Reference Effects | | 0
Learning to Rank in the Position Based Model with Bandit Feedback | | 0
Learning Unknown Markov Decision Processes: A Thompson Sampling Approach | | 0
Lenient Regret for Multi-Armed Bandits | | 0
Leveraging Demonstrations to Improve Online Learning: Quality Matters | | 0
Leveraging Offline Data from Similar Systems for Online Linear Quadratic Control | | 0
Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits | | 0
Linear Bandit algorithms using the Bootstrap | | 0
Linear Thompson Sampling Revisited | | 0
Little Exploration is All You Need | | 0
Maillard Sampling: Boltzmann Exploration Done Optimally | | 0
Making RL with Preference-based Feedback Efficient via Randomization | | 0
Making Sense of Reinforcement Learning and Probabilistic Inference | | 0
Markov Decision Process modeled with Bandits for Sequential Decision Making in Linear-flow | | 0
Optimization-Driven Adaptive Experimentation | | 0
Memory Sequence Length of Data Sampling Impacts the Adaptation of Meta-Reinforcement Learning Agents | | 0
Metadata-based Multi-Task Bandits with Bayesian Hierarchical Models | | 0
Meta Dynamic Pricing: Transfer Learning Across Experiments | | 0
Meta Learning in Bandits within Shared Affine Subspaces | | 0
Metalearning Linear Bandits by Prior Update | | 0
Meta Learning of Interface Conditions for Multi-Domain Physics-Informed Neural Networks | | 0

No leaderboard results yet.