Efficient Exploration through Bayesian Deep Q-Networks | Feb 13, 2018 | Atari Games, Efficient Exploration | Code available
Thompson Sampling for Dynamic Pricing | Feb 8, 2018 | Active Learning, Thompson Sampling | Unverified
Information Directed Sampling and Bandits with Heteroscedastic Noise | Jan 29, 2018 | Bayesian Optimization, Thompson Sampling | Unverified
Active Search for High Recall: a Non-Stationary Extension of Thompson Sampling | Dec 27, 2017 | Multi-Armed Bandits, Thompson Sampling | Unverified
On Adaptive Estimation for Dynamic Bernoulli Bandits | Dec 8, 2017 | Thompson Sampling | Unverified
Optimistic posterior sampling for reinforcement learning: worst-case regret bounds | Dec 1, 2017 | reinforcement-learning, Reinforcement Learning | Unverified
Efficient exploration with Double Uncertain Value Networks | Nov 29, 2017 | Efficient Exploration, Reinforcement Learning | Unverified
Customized Nonlinear Bandits for Online Response Selection in Neural Conversation Models | Nov 22, 2017 | Multi-Armed Bandits, Response Generation | Unverified
Bayesian Best-Arm Identification for Selecting Influenza Mitigation Strategies | Nov 16, 2017 | Decision Making, Thompson Sampling | Unverified
BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems | Nov 15, 2017 | Deep Reinforcement Learning, Efficient Exploration | Unverified
Estimating prediction error for complex samples | Nov 13, 2017 | Prediction, Survey | Unverified
Efficient-UCBV: An Almost Optimal Algorithm using Variance Estimates | Nov 9, 2017 | Thompson Sampling | Unverified
Information Directed Sampling for Stochastic Bandits with Graph Feedback | Nov 8, 2017 | Decision Making, Thompson Sampling | Unverified
The Effect of Communication on Noncooperative Multiplayer Multi-Armed Bandit Problems | Nov 5, 2017 | Thompson Sampling | Unverified
Generalized Probabilistic Bisection for Stochastic Root-Finding | Nov 2, 2017 | Thompson Sampling | Unverified
Minimal Exploration in Structured Stochastic Bandits | Nov 1, 2017 | Thompson Sampling | Unverified
Sequential Matrix Completion | Oct 23, 2017 | Collaborative Filtering, Matrix Completion | Unverified
A study of Thompson Sampling with Parameter h | Oct 5, 2017 | Thompson Sampling | Unverified
Learning Unknown Markov Decision Processes: A Thompson Sampling Approach | Sep 14, 2017 | Reinforcement Learning, Thompson Sampling | Unverified
Adaptive Exploration-Exploitation Tradeoff for Opportunistic Bandits | Sep 12, 2017 | Thompson Sampling | Unverified
Bayesian bandits: balancing the exploration-exploitation tradeoff via double sampling | Sep 10, 2017 | Reinforcement Learning, Thompson Sampling | Code available
Variational inference for the multi-armed contextual bandit | Sep 10, 2017 | Multi-Armed Bandits, Reinforcement Learning | Code available
Learning to Price with Reference Effects | Aug 29, 2017 | Reinforcement Learning, Thompson Sampling | Unverified
Racing Thompson: an Efficient Algorithm for Thompson Sampling with Non-conjugate Priors | Aug 16, 2017 | Thompson Sampling | Unverified
Thompson Sampling Guided Stochastic Searching on the Line for Deceptive Environments with Applications to Root-Finding Problems | Aug 5, 2017 | Stochastic Optimization, Thompson Sampling | Unverified
Reinforcement learning techniques for Outer Loop Link Adaptation in 4G/5G systems | Aug 3, 2017 | Multi-Armed Bandits, reinforcement-learning | Unverified
Streaming kernel regression with provably adaptive mean, variance, and regularization | Aug 2, 2017 | regression, Thompson Sampling | Unverified
Counterfactual Data-Fusion for Online Reinforcement Learners | Aug 1, 2017 | counterfactual, Decision Making | Unverified
Taming Non-stationary Bandits: A Bayesian Approach | Jul 31, 2017 | Thompson Sampling | Unverified
Combinatorial Multi-armed Bandit with Probabilistically Triggered Arms: A Case with Bounded Regret | Jul 24, 2017 | Movie Recommendation, Thompson Sampling | Unverified
Calibrated Fairness in Bandits | Jul 6, 2017 | Decision Making, Fairness | Unverified
A Practical Method for Solving Contextual Bandit Problems Using Decision Trees | Jun 14, 2017 | Thompson Sampling | Unverified
Bandit Models of Human Behavior: Reward Processing in Mental Disorders | Jun 7, 2017 | Decision Making, Thompson Sampling | Unverified
Parallel and Distributed Thompson Sampling for Large-scale Accelerated Exploration of Chemical Space | Jun 6, 2017 | Bayesian Optimization, Thompson Sampling | Unverified
Thompson Sampling for the MNL-Bandit | Jun 3, 2017 | Thompson Sampling | Unverified
Scalable Generalized Linear Bandits: Online Computation and Hashing | Jun 1, 2017 | Thompson Sampling | Unverified
Asynchronous Parallel Bayesian Optimisation via Thompson Sampling | May 25, 2017 | Bayesian Optimisation, Thompson Sampling | Code available
A Multi-Armed Bandit to Smartly Select a Training Set from Big Medical Data | May 23, 2017 | Thompson Sampling | Unverified
AIXIjs: A Software Demo for General Reinforcement Learning | May 22, 2017 | General Reinforcement Learning, OpenAI Gym | Code available
Ensemble Sampling | May 20, 2017 | Thompson Sampling | Unverified
Posterior sampling for reinforcement learning: worst-case regret bounds | May 19, 2017 | reinforcement-learning, Reinforcement Learning | Unverified
Adaptive Rate of Convergence of Thompson Sampling for Gaussian Process Optimization | May 18, 2017 | global-optimization, Thompson Sampling | Unverified
Context Attentive Bandits: Contextual Bandit with Restricted Context | May 10, 2017 | Recommendation Systems, Thompson Sampling | Unverified
Multi-dueling Bandits with Dependent Arms | Apr 29, 2017 | Thompson Sampling | Unverified
Mostly Exploration-Free Algorithms for Contextual Bandits | Apr 28, 2017 | Diversity, Multi-Armed Bandits | Code available
Time-Sensitive Bandit Learning and Satisficing Thompson Sampling | Apr 28, 2017 | Thompson Sampling | Unverified
Efficient Benchmarking of NLP APIs using Multi-armed Bandits | Apr 1, 2017 | Benchmarking, Multi-Armed Bandits | Unverified
Thompson Sampling for Linear-Quadratic Control Problems | Mar 27, 2017 | Reinforcement Learning, Thompson Sampling | Unverified
Horde of Bandits using Gaussian Markov Random Fields | Mar 7, 2017 | Clustering, Multi-Armed Bandits | Unverified
QoS-Aware Multi-Armed Bandits | Feb 28, 2017 | Decision Making, Multi-Armed Bandits | Unverified