Policy Gradient Optimization of Thompson Sampling Policies Jun 30, 2020 Policy Gradient Methods Thompson Sampling
Position-Based Multiple-Play Bandits with Thompson Sampling Sep 28, 2020 Position Recommendation Systems
Posterior Sampling-Based Bayesian Optimization with Tighter Bayesian Regret Bounds Nov 7, 2023 Bayesian Optimization Thompson Sampling
Posterior sampling for reinforcement learning: worst-case regret bounds May 19, 2017 reinforcement-learning Reinforcement Learning
Posterior Sampling via Autoregressive Generation May 29, 2024 Articles Decision Making
Practical Adversarial Attacks on Stochastic Bandits via Fake Data Injection May 28, 2025 Thompson Sampling
Preferential Multi-Objective Bayesian Optimization Jun 20, 2024 Autonomous Driving Bayesian Optimization
Prior-free and prior-dependent regret bounds for Thompson Sampling Apr 21, 2013 Thompson Sampling
Probabilistic Inference in Reinforcement Learning Done Right Nov 22, 2023 reinforcement-learning Reinforcement Learning
Profitable Bandits May 8, 2018 Management Thompson Sampling
QoS-Aware Multi-Armed Bandits Feb 28, 2017 Decision Making Multi-Armed Bandits
Racing Thompson: an Efficient Algorithm for Thompson Sampling with Non-conjugate Priors Aug 16, 2017 Thompson Sampling
Random Effect Bandits Jun 23, 2021 Multi-Armed Bandits Thompson Sampling
Random Hypervolume Scalarizations for Provable Multi-Objective Black Box Optimization Jun 8, 2020 Bayesian Optimization Thompson Sampling
Randomised Bayesian Least-Squares Policy Iteration Apr 6, 2019 Thompson Sampling
Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning Apr 16, 2024 Federated Learning Multi-agent Reinforcement Learning
Regenerative Particle Thompson Sampling Mar 15, 2022 Thompson Sampling
Regret Analysis of Bandit Problems with Causal Background Knowledge Oct 11, 2019 Thompson Sampling
Regret Analysis of the Finite-Horizon Gittins Index Strategy for Multi-Armed Bandits Nov 18, 2015 Multi-Armed Bandits Thompson Sampling
Regret Bounds for Information-Directed Reinforcement Learning Jun 9, 2022 reinforcement-learning Reinforcement Learning
Regularized-OFU: an efficient algorithm for general contextual bandit with optimization oracles Sep 29, 2021 Multi-Armed Bandits Thompson Sampling
Reinforcement Learning for Efficient and Tuning-Free Link Adaptation Oct 16, 2020 reinforcement-learning Reinforcement Learning
Reinforcement learning techniques for Outer Loop Link Adaptation in 4G/5G systems Aug 3, 2017 Multi-Armed Bandits reinforcement-learning
Reinforcement Learning with Subspaces using Free Energy Paradigm Dec 13, 2020 reinforcement-learning Reinforcement Learning
Reinforcement Learning with Trajectory Feedback Aug 13, 2020 reinforcement-learning Reinforcement Learning
Remote Contextual Bandits Feb 10, 2022 Marketing Multi-Armed Bandits
Residual Bootstrap Exploration for Bandit Algorithms Feb 19, 2020 Computational Efficiency Multi-Armed Bandits
Revised Progressive-Hedging-Algorithm Based Two-layer Solution Scheme for Bayesian Reinforcement Learning Jun 21, 2019 Reinforcement Learning Reinforcement Learning (RL)
Reward Biased Maximum Likelihood Estimation for Reinforcement Learning Nov 16, 2020 Multi-Armed Bandits reinforcement-learning
Risk and optimal policies in bandit experiments Dec 13, 2021 Dimensionality Reduction Thompson Sampling
Risk-averse Contextual Multi-armed Bandit Problem with Linear Payoffs Jun 24, 2022 Thompson Sampling
Risk-Constrained Thompson Sampling for CVaR Bandits Nov 16, 2020 Decision Making Thompson Sampling
Robust Dynamic Assortment Optimization in the Presence of Outlier Customers Oct 9, 2019 Assortment Optimization Thompson Sampling
Robust Policy Switching for Antifragile Reinforcement Learning for UAV Deconfliction in Adversarial Environments Jun 26, 2025 Reinforcement Learning (RL) Thompson Sampling
Robust Thompson Sampling Algorithms Against Reward Poisoning Attacks Oct 25, 2024 Decision Making Sequential Decision Making
Safe Linear Leveling Bandits Dec 13, 2021 Multi-Armed Bandits Thompson Sampling
Safe Linear Thompson Sampling with Side Information Nov 6, 2019 Thompson Sampling
Sample-based Dynamic Hierarchical Transformer with Layer and Head Flexibility via Contextual Bandit Dec 5, 2023 Thompson Sampling
The Price of Incentivizing Exploration: A Characterization via Thompson Sampling and Sample Complexity Feb 3, 2020 Multi-Armed Bandits Thompson Sampling
Sampling Acquisition Functions for Batch Bayesian Optimization Mar 22, 2019 Bayesian Optimization Thompson Sampling
Satisficing in Time-Sensitive Bandit Learning Mar 7, 2018 Thompson Sampling
Scalable and Interpretable Contextual Bandits: A Literature Review and Retail Offer Prototype May 22, 2025 Feature Engineering Large Language Model
Scalable Generalized Linear Bandits: Online Computation and Hashing Jun 1, 2017 Thompson Sampling
Scalable Neural Contextual Bandit for Recommender Systems Jun 26, 2023 Recommendation Systems Thompson Sampling
Scalable regret for learning to control network-coupled subsystems with unknown dynamics Aug 18, 2021 Thompson Sampling
Scalable Thompson Sampling using Sparse Gaussian Process Models Jun 9, 2020 Thompson Sampling
Scalable Thompson Sampling via Optimal Transport Feb 19, 2019 Decision Making Sequential Decision Making
Scaling Multi-Armed Bandit Algorithms Jul 25, 2019 Multi-Armed Bandits Sequential Decision Making
Screening for an Infectious Disease as a Problem in Stochastic Control Nov 1, 2020 Thompson Sampling
Semi-Parametric Contextual Bandits with Graph-Laplacian Regularization May 17, 2022 Multi-Armed Bandits Thompson Sampling