Optimal discharge of patients from intensive care via a data-driven policy learning framework Dec 17, 2021 Management Off-policy evaluation
Optimal Mixture Weights for Off-Policy Evaluation with Multiple Behavior Policies Nov 29, 2020 Off-policy evaluation Recommendation Systems
Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling Jun 8, 2019 Off-policy evaluation reinforcement-learning
Practical Marginalized Importance Sampling with the Successor Representation Jan 1, 2021 Deep Reinforcement Learning MuJoCo
Primal-Dual Spectral Representation for Off-policy Evaluation Oct 23, 2024 Off-policy evaluation Reinforcement Learning (RL)
Privacy Preserving Off-Policy Evaluation Feb 1, 2019 Off-policy evaluation Privacy Preserving
Probabilistic Offline Policy Ranking with Approximate Bayesian Computation Dec 17, 2023 Off-policy evaluation
Quantile Off-Policy Evaluation via Deep Conditional Generative Learning Dec 29, 2022 Decision Making Off-policy evaluation
Reliable Off-policy Evaluation for Reinforcement Learning Nov 8, 2020 Decision Making Off-policy evaluation
RL in Latent MDPs is Tractable: Online Guarantees via Off-Policy Evaluation Jun 3, 2024 LEMMA Off-policy evaluation
Debiased Off-Policy Evaluation for Recommendation Systems Feb 20, 2020 counterfactual Off-policy evaluation
Safe Evaluation For Offline Learning: Are We Ready To Deploy? Dec 16, 2022 Off-policy evaluation Reinforcement Learning (RL)
Sample Complexity of Nonparametric Off-Policy Evaluation on Low-Dimensional Manifolds using Deep Networks Jun 6, 2022 Off-policy evaluation
Sample Complexity of Preference-Based Nonparametric Off-Policy Evaluation with Deep Networks Oct 16, 2023 Off-policy evaluation reinforcement-learning
Scalable and Robust Self-Learning for Skill Routing in Large-Scale Conversational AI Systems Apr 14, 2022 Off-policy evaluation Self-Learning
Scalable and Safe Remediation of Defective Actions in Self-Learning Conversational Systems May 17, 2023 Off-policy evaluation regression
Scaling Marginalized Importance Sampling to High-Dimensional State-Spaces via State Abstraction Dec 14, 2022 Off-policy evaluation Reinforcement Learning (RL)
Semi-gradient DICE for Offline Constrained Reinforcement Learning Jun 10, 2025 Offline RL Off-policy evaluation
STEEL: Singularity-aware Reinforcement Learning Jan 30, 2023 Off-policy evaluation reinforcement-learning
Smoothed functional-based gradient algorithms for off-policy reinforcement learning: A non-asymptotic viewpoint Jan 6, 2021 Off-policy evaluation Reinforcement Learning (RL)
Stabilizing Temporal Difference Learning via Implicit Stochastic Recursion May 2, 2025 Computational Efficiency Off-policy evaluation
Stateful Offline Contextual Policy Evaluation and Learning Oct 19, 2021 Management Multi-Armed Bandits
Statistical Bootstrapping for Uncertainty Estimation in Off-Policy Evaluation Jul 27, 2020 continuous-control Continuous Control
Statistical Estimation of Confounded Linear MDPs: An Instrumental Variable Approach Sep 12, 2022 Off-policy evaluation
Statistically Efficient Variance Reduction with Double Policy Estimation for Off-Policy Evaluation in Sequence-Modeled Reinforcement Learning Aug 28, 2023 D4RL Off-policy evaluation
STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation May 27, 2025 D4RL Denoising
Task Selection Policies for Multitask Learning Jul 14, 2019 counterfactual Natural Language Understanding
Taylor Expansion Policy Optimization Mar 13, 2020 Off-policy evaluation reinforcement-learning
Cramming Contextual Bandits for On-policy Statistical Evaluation Mar 11, 2024 Multi-Armed Bandits Off-policy evaluation
The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation Jul 25, 2023 Off-policy evaluation Reinforcement Learning (RL)
Towards A Unified Policy Abstraction Theory and Representation Learning Approach in Markov Decision Processes Sep 16, 2022 Decision Making Metric Learning
Towards Robust Off-Policy Evaluation via Human Inputs Sep 18, 2022 Multi-Armed Bandits Off-policy evaluation
Triply Robust Off-Policy Evaluation Nov 13, 2019 Multi-Armed Bandits Off-policy evaluation
Unbiased Offline Evaluation for Learning to Rank with Business Rules Nov 3, 2023 Learning-To-Rank Off-policy evaluation
Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling Oct 15, 2019 Off-policy evaluation Reinforcement Learning
Variance-Aware Off-Policy Evaluation with Linear Function Approximation Jun 22, 2021 Off-policy evaluation Reinforcement Learning (RL)
Wasserstein Distributionally Robust Policy Evaluation and Learning for Contextual Bandits Sep 15, 2023 Multi-Armed Bandits Off-policy evaluation
Weighted model estimation for offline model-based reinforcement learning Dec 1, 2021 Density Ratio Estimation model
Why Should I Trust You, Bellman? Evaluating the Bellman Objective with Off-Policy Data Sep 29, 2021 Deep Reinforcement Learning Off-policy evaluation
Data-Driven Off-Policy Estimator Selection: An Application in User Marketing on An Online Content Delivery Service Sep 17, 2021 Decision Making Marketing
Data Poisoning Attacks on Off-Policy Policy Evaluation Methods Apr 6, 2024 Data Poisoning Off-policy evaluation
Debiasing Samples from Online Learning Using Bootstrap Jul 31, 2021 Off-policy evaluation Thompson Sampling
Deep Jump Q-Evaluation for Offline Policy Evaluation in Continuous Action Space Sep 28, 2020 Off-policy evaluation Q-Learning
Defining Admissible Rewards for High Confidence Policy Evaluation May 30, 2019 Off-policy evaluation Reinforcement Learning
Designing an Interpretable Interface for Contextual Bandits Sep 23, 2024 Multi-Armed Bandits Off-policy evaluation
Development and Validation of Heparin Dosing Policies Using an Offline Reinforcement Learning Algorithm Sep 24, 2024 Offline RL Off-policy evaluation
Discovering an Aid Policy to Minimize Student Evasion Using Offline Reinforcement Learning Apr 20, 2021 Clustering Decision Making
Distributional Shift-Aware Off-Policy Interval Estimation: A Unified Error Quantification Framework Sep 23, 2023 Off-policy evaluation
Double/Debiased Machine Learning for Dynamic Treatment Effects via g-Estimation Feb 17, 2020 BIG-bench Machine Learning Model Selection
Double Reinforcement Learning for Efficient and Robust Off-Policy Evaluation Jan 1, 2020 Off-policy evaluation reinforcement-learning