Diffusion Models Meet Contextual Bandits with Large Action Spaces Feb 15, 2024 Efficient Exploration Multi-Armed Bandits
DISCO: An End-to-End Bandit Framework for Personalised Discount Allocation Jun 10, 2024 Thompson Sampling
Aging Bandits: Regret Analysis and Order-Optimal Learning Algorithm for Wireless Networks with Stochastic Arrivals Dec 16, 2020 Thompson Sampling
Distilled Thompson Sampling: Practical and Efficient Thompson Sampling via Imitation Learning Nov 29, 2020 Action Generation Decision Making
Debiasing Samples from Online Learning Using Bootstrap Jul 31, 2021 Off-policy evaluation Thompson Sampling
Asymptotic Convergence of Thompson Sampling Nov 8, 2020 Multi-Armed Bandits Thompson Sampling
Diversified Sampling for Batched Bayesian Optimization with Determinantal Point Processes Oct 22, 2021 Bayesian Optimization Diversity
Double Doubly Robust Thompson Sampling for Generalized Linear Contextual Bandits Sep 15, 2022 Multi-Armed Bandits Thompson Sampling
Double-Linear Thompson Sampling for Context-Attentive Bandits Oct 15, 2020 Medical Diagnosis Thompson Sampling
Customized Nonlinear Bandits for Online Response Selection in Neural Conversation Models Nov 22, 2017 Multi-Armed Bandits Response Generation
Cover Tree Bayesian Reinforcement Learning May 8, 2013 reinforcement-learning Reinforcement Learning
Double Thompson Sampling in Finite stochastic Games Feb 21, 2022 Thompson Sampling
Online Multi-Armed Bandits with Adaptive Inference Feb 25, 2021 Causal Inference Decision Making
Doubly robust Thompson sampling for linear payoffs Feb 1, 2021 Thompson Sampling
Doubly Robust Thompson Sampling with Linear Payoffs Dec 1, 2021 Thompson Sampling
DRL-based Joint Resource Scheduling of eMBB and URLLC in O-RAN Jul 16, 2024 Decision Making Deep Reinforcement Learning
Dual-Directed Algorithm Design for Efficient Pure Exploration Oct 30, 2023 Thompson Sampling
The Choice of Noninformative Priors for Thompson Sampling in Multiparameter Bandit Models Feb 28, 2023 Multi-Armed Bandits Thompson Sampling
Dynamic collaborative filtering Thompson Sampling for cross-domain advertisements recommendation Aug 25, 2022 Collaborative Filtering Recommendation Systems
Dynamic Decision-Making under Model Misspecification May 20, 2025 Decision Making model
A General Recipe for the Analysis of Randomized Multi-Armed Bandit Algorithms Mar 10, 2023 Thompson Sampling
Towards Efficient and Optimal Covariance-Adaptive Algorithms for Combinatorial Semi-Bandits Feb 23, 2024 Thompson Sampling
Effects of Model Misspecification on Bayesian Bandits: Case Studies in UX Optimization Oct 7, 2020 Thompson Sampling
Efficient and Adaptive Posterior Sampling Algorithms for Bandits May 2, 2024 Thompson Sampling
Efficient Benchmarking of NLP APIs using Multi-armed Bandits Apr 1, 2017 Benchmarking Multi-Armed Bandits
Efficient Exploration for LLMs Feb 1, 2024 Efficient Exploration Thompson Sampling
Efficient exploration of zero-sum stochastic games Feb 24, 2020 Efficient Exploration Thompson Sampling
Counterfactual Inference under Thompson Sampling Apr 3, 2025 Causal Inference counterfactual
Efficient exploration with Double Uncertain Value Networks Nov 29, 2017 Efficient Exploration Reinforcement Learning
Efficient Inference Without Trading-off Regret in Bandits: An Allocation Probability Test for Thompson Sampling Oct 30, 2021 Thompson Sampling
Asymptotically Optimal Linear Best Feasible Arm Identification with Fixed Budget Jun 3, 2025 Thompson Sampling
Efficient Learning in Large-Scale Combinatorial Semi-Bandits Jun 28, 2014 Thompson Sampling
Counterfactual Data-Fusion for Online Reinforcement Learners Aug 1, 2017 counterfactual Decision Making
Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling Oct 7, 2024 continuous-control Continuous Control
Efficient Multivariate Bandit Algorithm with Path Planning Sep 6, 2019 Heuristic Search Thompson Sampling
Efficient Online Learning for Cognitive Radar-Cellular Coexistence via Contextual Thompson Sampling Aug 24, 2020 Deep Reinforcement Learning Thompson Sampling
Asymptotically Optimal Bandits under Weighted Information May 28, 2021 Thompson Sampling
Efficient Thompson Sampling for Online Matrix-Factorization Recommendation Dec 1, 2015 Collaborative Filtering Recommendation Systems
A General Theory of the Stochastic Linear Bandit and Its Applications Feb 12, 2020 Multi-Armed Bandits Thompson Sampling
Eluder Dimension and the Sample Complexity of Optimistic Exploration Dec 1, 2013 Thompson Sampling
ε-Neural Thompson Sampling of Deep Brain Stimulation for Parkinson Disease Treatment Mar 11, 2024 Multi-Armed Bandits Reinforcement Learning (RL)
Ensemble Sampling May 20, 2017 Thompson Sampling
Cost-efficient Knowledge-based Question Answering with Large Language Models May 27, 2024 Knowledge Graphs Model Selection
Epsilon-Greedy Thompson Sampling to Bayesian Optimization Mar 1, 2024 Bayesian Optimization Cantilever Beam
Cost Aware Asynchronous Multi-Agent Active Search Oct 5, 2022 Decision Making Thompson Sampling
Estimating prediction error for complex samples Nov 13, 2017 Prediction Survey
Asymptotically Optimal Algorithms for Budgeted Multiple Play Bandits Jun 30, 2016 Thompson Sampling
Etat de l'art sur l'application des bandits multi-bras Jan 4, 2021 Thompson Sampling
EVaDE : Event-Based Variational Thompson Sampling for Model-Based Reinforcement Learning Jan 16, 2025 Model-based Reinforcement Learning reinforcement-learning
Convolutional Monte Carlo Rollouts in Go Dec 10, 2015 GPU Thompson Sampling