Generalizing Off-Policy Evaluation From a Causal Perspective For Sequential Decision-Making | Jan 20, 2022 | counterfactual, Decision Making | Unverified
On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation | Jan 17, 2022 | Off-policy evaluation | Unverified
Off-Policy Evaluation Using Information Borrowing and Context-Based Switching | Dec 18, 2021 | Multi-Armed Bandits, Off-policy evaluation | Code Available
Optimal discharge of patients from intensive care via a data-driven policy learning framework | Dec 17, 2021 | Management, Off-policy evaluation | Unverified
Subgaussian and Differentiable Importance Sampling for Off-Policy Evaluation and Learning | Dec 1, 2021 | Multi-Armed Bandits, Off-policy evaluation | Code Available
Weighted model estimation for offline model-based reinforcement learning | Dec 1, 2021 | Density Ratio Estimation, model | Unverified
Loss Functions for Discrete Contextual Pricing with Observational Data | Nov 18, 2021 | Management, Off-policy evaluation | Unverified
A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes | Nov 12, 2021 | Off-policy evaluation | Code Available
SOPE: Spectrum of Off-Policy Estimators | Nov 6, 2021 | Decision Making, Off-policy evaluation | Code Available
Proximal Reinforcement Learning: Efficient Off-Policy Evaluation in Partially Observed Markov Decision Processes | Oct 28, 2021 | Causal Inference, Management | Code Available
Towards Hyperparameter-free Policy Selection for Offline Reinforcement Learning | Oct 26, 2021 | Off-policy evaluation, Open-Ended Question Answering | Code Available
Off-Policy Evaluation in Partially Observed Markov Decision Processes under Sequential Ignorability | Oct 24, 2021 | Off-policy evaluation | Unverified
Stateful Offline Contextual Policy Evaluation and Learning | Oct 19, 2021 | Management, Multi-Armed Bandits | Unverified
Why Should I Trust You, Bellman? Evaluating the Bellman Objective with Off-Policy Data | Sep 29, 2021 | Deep Reinforcement Learning, Off-policy evaluation | Unverified
A Spectral Approach to Off-Policy Evaluation for POMDPs | Sep 22, 2021 | Causal Identification, Off-policy evaluation | Unverified
Accelerating Offline Reinforcement Learning Application in Real-Time Bidding and Recommendation: Potential Use of Simulation | Sep 17, 2021 | Decision Making, Offline RL | Unverified
Data-Driven Off-Policy Estimator Selection: An Application in User Marketing on An Online Content Delivery Service | Sep 17, 2021 | Decision Making, Marketing | Unverified
State Relevance for Off-Policy Evaluation | Sep 13, 2021 | Off-policy evaluation | Code Available
Debiasing Samples from Online Learning Using Bootstrap | Jul 31, 2021 | Off-policy evaluation, Thompson Sampling | Unverified
Online Learning for Recommendations at Grubhub | Jul 15, 2021 | Incremental Learning, Off-policy evaluation | Unverified
A Unified Off-Policy Evaluation Approach for General Value Function | Jul 6, 2021 | Anomaly Detection, Off-policy evaluation | Unverified
Supervised Off-Policy Ranking | Jul 3, 2021 | Off-policy evaluation | Code Available
Variance-Aware Off-Policy Evaluation with Linear Function Approximation | Jun 22, 2021 | Off-policy evaluation, Reinforcement Learning (RL) | Unverified
Control Variates for Slate Off-Policy Evaluation | Jun 15, 2021 | Off-policy evaluation, Recommendation Systems | Code Available
Robust Generalization despite Distribution Shift via Minimum Discriminating Information | Jun 8, 2021 | Generalization Bounds, Off-policy evaluation | Code Available
Deep Proxy Causal Learning and its Application to Confounded Bandit Policy Evaluation | Jun 7, 2021 | Off-policy evaluation | Code Available
Deeply-Debiased Off-Policy Interval Estimation | May 10, 2021 | Off-policy evaluation | Code Available
Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization | Apr 28, 2021 | continuous-control, Continuous Control | Unverified
Universal Off-Policy Evaluation | Apr 26, 2021 | counterfactual, Decision Making | Code Available
Discovering an Aid Policy to Minimize Student Evasion Using Offline Reinforcement Learning | Apr 20, 2021 | Clustering, Decision Making | Unverified
Off-Policy Risk Assessment in Contextual Bandits | Apr 18, 2021 | Multi-Armed Bandits, Off-policy evaluation | Unverified
Infinite-Horizon Offline Reinforcement Learning with Linear Function Approximation: Curse of Dimensionality and Algorithm | Mar 17, 2021 | Off-policy evaluation, Reinforcement Learning (RL) | Unverified
Non-asymptotic Confidence Intervals of Off-policy Evaluation: Primal and Dual Bounds | Mar 9, 2021 | Off-policy evaluation, Open-Ended Question Answering | Unverified
Minimax Model Learning | Mar 2, 2021 | model, Model-based Reinforcement Learning | Unverified
Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach | Feb 20, 2021 | Model-based Reinforcement Learning, Off-policy evaluation | Code Available
Off-policy Confidence Sequences | Feb 18, 2021 | Off-policy evaluation, valid | Unverified
Bootstrapping Fitted Q-Evaluation for Off-Policy Inference | Feb 6, 2021 | Off-policy evaluation | Unverified
Finite Sample Analysis of Minimax Offline Reinforcement Learning: Completeness, Fast Rates and First-Order Efficiency | Feb 5, 2021 | Off-policy evaluation, reinforcement-learning | Unverified
Minimax Off-Policy Evaluation for Multi-Armed Bandits | Jan 19, 2021 | Multi-Armed Bandits, Off-policy evaluation | Unverified
Smoothed functional-based gradient algorithms for off-policy reinforcement learning: A non-asymptotic viewpoint | Jan 6, 2021 | Off-policy evaluation, Reinforcement Learning (RL) | Unverified
Off-Policy Evaluation of Slate Policies under Bayes Risk | Jan 5, 2021 | Off-policy evaluation | Unverified
Practical Marginalized Importance Sampling with the Successor Representation | Jan 1, 2021 | Deep Reinforcement Learning, MuJoCo | Unverified
Optimal Mixture Weights for Off-Policy Evaluation with Multiple Behavior Policies | Nov 29, 2020 | Off-policy evaluation, Recommendation Systems | Unverified
Reliable Off-policy Evaluation for Reinforcement Learning | Nov 8, 2020 | Decision Making, Off-policy evaluation | Unverified
Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity | Nov 5, 2020 | Diversity, Off-policy evaluation | Code Available
Off-Policy Interval Estimation with Lipschitz Value Iteration | Oct 29, 2020 | Decision Making, Medical Diagnosis | Unverified
Deep Jump Learning for Off-Policy Evaluation in Continuous Treatment Settings | Oct 29, 2020 | Change Point Detection, Off-policy evaluation | Code Available
A Practical Guide of Off-Policy Evaluation for Bandit Problems | Oct 23, 2020 | Off-policy evaluation | Unverified
Off-Policy Evaluation of Bandit Algorithm from Dependent Samples under Batch Update Policy | Oct 23, 2020 | Off-policy evaluation | Unverified
CoinDICE: Off-Policy Confidence Interval Estimation | Oct 22, 2020 | Off-policy evaluation, Reinforcement Learning (RL) | Unverified