Stacked Thompson Bandits | Feb 28, 2017 | Thompson Sampling | Code Available
Thompson Sampling For Stochastic Bandits with Graph Feedback | Jan 16, 2017 | Thompson Sampling | Unverified
Estimating Quality in Multi-Objective Bandits Optimization | Jan 4, 2017 | Thompson Sampling | Unverified
Exploration for Multi-task Reinforcement Learning with Deep Generative Models | Nov 29, 2016 | reinforcement-learning, Reinforcement Learning | Unverified
Nonparametric General Reinforcement Learning | Nov 28, 2016 | General Reinforcement Learning, reinforcement-learning | Unverified
Linear Thompson Sampling Revisited | Nov 20, 2016 | Thompson Sampling | Unverified
Unimodal Thompson Sampling for Graph-Structured Arms | Nov 17, 2016 | Thompson Sampling | Unverified
The End of Optimism? An Asymptotic Analysis of Finite-Armed Linear Bandits | Oct 14, 2016 | reinforcement-learning, Reinforcement Learning | Unverified
A Formal Solution to the Grain of Truth Problem | Sep 16, 2016 | Thompson Sampling | Unverified
BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems | Aug 17, 2016 | Deep Reinforcement Learning, Efficient Exploration | Unverified
Human collective intelligence as distributed Bayesian inference | Aug 5, 2016 | Bayesian Inference, Decision Making | Unverified
Asymptotically Optimal Algorithms for Budgeted Multiple Play Bandits | Jun 30, 2016 | Thompson Sampling | Unverified
Online Algorithms For Parameter Mean And Variance Estimation In Dynamic Regression Models | May 18, 2016 | parameter estimation, regression | Unverified
Linear Bandit algorithms using the Bootstrap | May 4, 2016 | Thompson Sampling | Unverified
Double Thompson Sampling for Dueling Bandits | Apr 25, 2016 | Thompson Sampling | Code Available
An Unbiased Data Collection and Content Exploitation/Exploration Strategy for Personalization | Apr 12, 2016 | Recommendation Systems, Thompson Sampling | Unverified
A sequential Monte Carlo approach to Thompson sampling for Bayesian optimization | Apr 1, 2016 | Bayesian Optimization, Thompson Sampling | Unverified
Optimal Recommendation to Users that React: Online Learning for a Class of POMDPs | Mar 30, 2016 | Recommendation Systems, Reinforcement Learning | Unverified
Cascading Bandits for Large-Scale Recommendation Problems | Mar 17, 2016 | Multi-Armed Bandits, Recommendation Systems | Code Available
Simple Bayesian Algorithms for Best Arm Identification | Feb 26, 2016 | Thompson Sampling | Unverified
Thompson Sampling is Asymptotically Optimal in General Environments | Feb 25, 2016 | reinforcement-learning, Reinforcement Learning | Unverified
Convolutional Monte Carlo Rollouts in Go | Dec 10, 2015 | GPU, Thompson Sampling | Unverified
Efficient Thompson Sampling for Online Matrix-Factorization Recommendation | Dec 1, 2015 | Collaborative Filtering, Recommendation Systems | Unverified
Regret Analysis of the Finite-Horizon Gittins Index Strategy for Multi-Armed Bandits | Nov 18, 2015 | Multi-Armed Bandits, Thompson Sampling | Unverified
TSEB: More Efficient Thompson Sampling for Policy Learning | Oct 10, 2015 | Thompson Sampling | Unverified
Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models | Jul 3, 2015 | Atari Games, reinforcement-learning | Code Available
Bootstrapped Thompson Sampling and Deep Exploration | Jul 1, 2015 | reinforcement-learning, Reinforcement Learning | Unverified
On the Prior Sensitivity of Thompson Sampling | Jun 10, 2015 | Sensitivity, Thompson Sampling | Unverified
Optimal Regret Analysis of Thompson Sampling in Stochastic Multi-armed Bandit Problem with Multiple Plays | Jun 2, 2015 | Thompson Sampling | Code Available
Belief Flows of Robust Online Learning | May 26, 2015 | General Classification, regression | Unverified
Thompson Sampling for Budgeted Multi-armed Bandits | May 1, 2015 | Multi-Armed Bandits, Thompson Sampling | Unverified
Evaluation of Explore-Exploit Policies in Multi-result Ranking Systems | Apr 28, 2015 | News Recommendation, Thompson Sampling | Unverified
A Note on Information-Directed Sampling and Thompson Sampling | Mar 24, 2015 | Thompson Sampling | Unverified
Bandit Convex Optimization: √T Regret in One Dimension | Feb 23, 2015 | Thompson Sampling | Unverified
Thompson sampling with the online bootstrap | Oct 15, 2014 | Thompson Sampling | Unverified
Freshness-Aware Thompson Sampling | Sep 29, 2014 | Recommendation Systems, Thompson Sampling | Unverified
Towards Optimal Algorithms for Prediction with Expert Advice | Sep 10, 2014 | Prediction, Thompson Sampling | Unverified
Thompson Sampling for Learning Parameterized Markov Decision Processes | Jun 29, 2014 | Form, reinforcement-learning | Unverified
Efficient Learning in Large-Scale Combinatorial Semi-Bandits | Jun 28, 2014 | Thompson Sampling | Unverified
An Information-Theoretic Analysis of Thompson Sampling | Mar 21, 2014 | Thompson Sampling | Unverified
Better Optimism By Bayes: Adaptive Planning with Rich Models | Feb 9, 2014 | Model-based Reinforcement Learning, Reinforcement Learning | Unverified
Bayesian Mixture Modelling and Inference based Thompson Sampling in Monte-Carlo Tree Search | Dec 1, 2013 | Thompson Sampling | Unverified
Eluder Dimension and the Sample Complexity of Optimistic Exploration | Dec 1, 2013 | Thompson Sampling | Unverified
Thompson Sampling for Complex Bandit Problems | Nov 3, 2013 | Thompson Sampling | Unverified
Thompson Sampling for Online Learning with Linear Experts | Nov 3, 2013 | Thompson Sampling | Unverified
Generalized Thompson Sampling for Contextual Bandits | Oct 27, 2013 | Multi-Armed Bandits, Thompson Sampling | Unverified
Thompson Sampling in Dynamic Systems for Contextual Bandit Problems | Oct 17, 2013 | Thompson Sampling | Unverified
Thompson Sampling for 1-Dimensional Exponential Family Bandits | Jul 12, 2013 | Thompson Sampling | Unverified
Cover Tree Bayesian Reinforcement Learning | May 8, 2013 | reinforcement-learning, Reinforcement Learning | Unverified
Prior-free and prior-dependent regret bounds for Thompson Sampling | Apr 21, 2013 | Thompson Sampling | Unverified