Off-Policy Evaluation in Partially Observed Markov Decision Processes under Sequential Ignorability | Oct 24, 2021 | Off-policy evaluation | Unverified (0)
Stateful Offline Contextual Policy Evaluation and Learning | Oct 19, 2021 | Management, Multi-Armed Bandits | Unverified (0)
Why Should I Trust You, Bellman? Evaluating the Bellman Objective with Off-Policy Data | Sep 29, 2021 | Deep Reinforcement Learning, Off-policy evaluation | Unverified (0)
A Spectral Approach to Off-Policy Evaluation for POMDPs | Sep 22, 2021 | Causal Identification, Off-policy evaluation | Unverified (0)
Data-Driven Off-Policy Estimator Selection: An Application in User Marketing on An Online Content Delivery Service | Sep 17, 2021 | Decision Making, Marketing | Unverified (0)
Accelerating Offline Reinforcement Learning Application in Real-Time Bidding and Recommendation: Potential Use of Simulation | Sep 17, 2021 | Decision Making, Offline RL | Unverified (0)
State Relevance for Off-Policy Evaluation | Sep 13, 2021 | Off-policy evaluation | Code Available (0)
Evaluating the Robustness of Off-Policy Evaluation | Aug 31, 2021 | Off-policy evaluation, Recommendation Systems | Code Available (1)
Debiasing Samples from Online Learning Using Bootstrap | Jul 31, 2021 | Off-policy evaluation, Thompson Sampling | Unverified (0)
Model Selection for Offline Reinforcement Learning: Practical Considerations for Healthcare Settings | Jul 23, 2021 | Computational Efficiency, Decision Making | Code Available (1)
Online Learning for Recommendations at Grubhub | Jul 15, 2021 | Incremental Learning, Off-policy evaluation | Unverified (0)
A Unified Off-Policy Evaluation Approach for General Value Function | Jul 6, 2021 | Anomaly Detection, Off-policy evaluation | Unverified (0)
Supervised Off-Policy Ranking | Jul 3, 2021 | Off-policy evaluation | Code Available (0)
Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation | Jun 24, 2021 | Meta Reinforcement Learning, Off-policy evaluation | Code Available (1)
Variance-Aware Off-Policy Evaluation with Linear Function Approximation | Jun 22, 2021 | Off-policy evaluation, Reinforcement Learning (RL) | Unverified (0)
Active Offline Policy Selection | Jun 18, 2021 | Bayesian Optimization, Off-policy evaluation | Code Available (1)
Offline RL Without Off-Policy Evaluation | Jun 16, 2021 | D4RL, Offline RL | Code Available (1)
Control Variates for Slate Off-Policy Evaluation | Jun 15, 2021 | Off-policy evaluation, Recommendation Systems | Code Available (0)
A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation | Jun 12, 2021 | Deep Reinforcement Learning, MuJoCo | Code Available (1)
Robust Generalization despite Distribution Shift via Minimum Discriminating Information | Jun 8, 2021 | Generalization Bounds, Off-policy evaluation | Code Available (0)
Deep Proxy Causal Learning and its Application to Confounded Bandit Policy Evaluation | Jun 7, 2021 | Off-policy evaluation | Code Available (0)
Off-Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits | Jun 3, 2021 | Multi-Armed Bandits, Off-policy evaluation | Code Available (1)
Deeply-Debiased Off-Policy Interval Estimation | May 10, 2021 | Off-policy evaluation | Code Available (0)
Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization | Apr 28, 2021 | continuous-control, Continuous Control | Unverified (0)
Universal Off-Policy Evaluation | Apr 26, 2021 | counterfactual, Decision Making | Code Available (0)
Discovering an Aid Policy to Minimize Student Evasion Using Offline Reinforcement Learning | Apr 20, 2021 | Clustering, Decision Making | Unverified (0)
Off-Policy Risk Assessment in Contextual Bandits | Apr 18, 2021 | Multi-Armed Bandits, Off-policy evaluation | Unverified (0)
Benchmarks for Deep Off-Policy Evaluation | Mar 30, 2021 | Benchmarking, continuous-control | Code Available (1)
Infinite-Horizon Offline Reinforcement Learning with Linear Function Approximation: Curse of Dimensionality and Algorithm | Mar 17, 2021 | Off-policy evaluation, Reinforcement Learning (RL) | Unverified (0)
Non-asymptotic Confidence Intervals of Off-policy Evaluation: Primal and Dual Bounds | Mar 9, 2021 | Off-policy evaluation, Open-Ended Question Answering | Unverified (0)
Minimax Model Learning | Mar 2, 2021 | model, Model-based Reinforcement Learning | Unverified (0)
Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach | Feb 20, 2021 | Model-based Reinforcement Learning, Off-policy evaluation | Code Available (0)
Off-policy Confidence Sequences | Feb 18, 2021 | Off-policy evaluation, valid | Unverified (0)
Bootstrapping Fitted Q-Evaluation for Off-Policy Inference | Feb 6, 2021 | Off-policy evaluation | Unverified (0)
Finite Sample Analysis of Minimax Offline Reinforcement Learning: Completeness, Fast Rates and First-Order Efficiency | Feb 5, 2021 | Off-policy evaluation, reinforcement-learning | Unverified (0)
Minimax Off-Policy Evaluation for Multi-Armed Bandits | Jan 19, 2021 | Multi-Armed Bandits, Off-policy evaluation | Unverified (0)
Smoothed functional-based gradient algorithms for off-policy reinforcement learning: A non-asymptotic viewpoint | Jan 6, 2021 | Off-policy evaluation, Reinforcement Learning (RL) | Unverified (0)
Off-Policy Evaluation of Slate Policies under Bayes Risk | Jan 5, 2021 | Off-policy evaluation | Unverified (0)
Practical Marginalized Importance Sampling with the Successor Representation | Jan 1, 2021 | Deep Reinforcement Learning, MuJoCo | Unverified (0)
Optimal Mixture Weights for Off-Policy Evaluation with Multiple Behavior Policies | Nov 29, 2020 | Off-policy evaluation, Recommendation Systems | Unverified (0)
Reliable Off-policy Evaluation for Reinforcement Learning | Nov 8, 2020 | Decision Making, Off-policy evaluation | Unverified (0)
Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity | Nov 5, 2020 | Diversity, Off-policy evaluation | Code Available (0)
Deep Jump Learning for Off-Policy Evaluation in Continuous Treatment Settings | Oct 29, 2020 | Change Point Detection, Off-policy evaluation | Code Available (0)
Off-Policy Interval Estimation with Lipschitz Value Iteration | Oct 29, 2020 | Decision Making, Medical Diagnosis | Unverified (0)
Off-Policy Evaluation of Bandit Algorithm from Dependent Samples under Batch Update Policy | Oct 23, 2020 | Off-policy evaluation | Unverified (0)
A Practical Guide of Off-Policy Evaluation for Bandit Problems | Oct 23, 2020 | Off-policy evaluation | Unverified (0)
CoinDICE: Off-Policy Confidence Interval Estimation | Oct 22, 2020 | Off-policy evaluation, Reinforcement Learning (RL) | Unverified (0)
Optimal Off-Policy Evaluation from Multiple Logging Policies | Oct 21, 2020 | Off-policy evaluation | Code Available (1)
Deep Jump Q-Evaluation for Offline Policy Evaluation in Continuous Action Space | Sep 28, 2020 | Off-policy evaluation, Q-Learning | Unverified (0)
Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation | Aug 17, 2020 | Off-policy evaluation | Code Available (1)