Cross-Validated Off-Policy Evaluation
2024-05-24
Matej Cief, Branislav Kveton, Michal Kompan
Code
- github.com/navarog/cross-validated-ope (official, linked in paper, PyTorch)
Abstract
We study estimator selection and hyper-parameter tuning in off-policy evaluation. Although cross-validation is the most popular method for model selection in supervised learning, off-policy evaluation relies mostly on theory, which provides only limited guidance to practitioners. We show how to use cross-validation for off-policy evaluation. This challenges a popular belief that cross-validation in off-policy evaluation is not feasible. We evaluate our method empirically and show that it addresses a variety of use cases.
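The abstract does not spell out the procedure, so the following is only a minimal sketch of how cross-validation might be applied to hyper-parameter tuning in off-policy evaluation. It is not the authors' method. It assumes a contextual-bandit log with known logging propensities, uses clipped inverse propensity scoring (IPS) as the family of candidate estimators, and treats unclipped IPS on a held-out half as the validation target. The function names `ips` and `select_clip_by_cv`, the 50/50 split, and the squared-error criterion are all illustrative assumptions.

```python
import numpy as np


def ips(rewards, target_probs, logging_probs, clip=np.inf):
    """Inverse propensity scoring estimate with optional weight clipping."""
    weights = np.minimum(target_probs / logging_probs, clip)
    return float(np.mean(weights * rewards))


def select_clip_by_cv(rewards, target_probs, logging_probs, clips,
                      n_splits=10, seed=0):
    """Pick the clipping constant whose training-split estimate is closest,
    in mean squared error, to an unclipped IPS estimate on the validation split.

    Assumption: this split-and-compare scheme is a generic illustration,
    not the procedure from the paper.
    """
    rng = np.random.default_rng(seed)
    n = len(rewards)
    errors = {c: [] for c in clips}
    for _ in range(n_splits):
        perm = rng.permutation(n)
        train, valid = perm[: n // 2], perm[n // 2:]
        # Unclipped IPS is unbiased, so it serves as a noisy validation target.
        target = ips(rewards[valid], target_probs[valid], logging_probs[valid])
        for c in clips:
            estimate = ips(rewards[train], target_probs[train],
                           logging_probs[train], clip=c)
            errors[c].append((estimate - target) ** 2)
    mean_errors = {c: float(np.mean(e)) for c, e in errors.items()}
    best = min(mean_errors, key=mean_errors.get)
    return best, mean_errors


if __name__ == "__main__":
    # Synthetic logged data, for illustration only.
    rng = np.random.default_rng(1)
    n = 2000
    logging_probs = rng.uniform(0.1, 0.9, size=n)
    target_probs = rng.uniform(0.1, 0.9, size=n)
    rewards = rng.binomial(1, 0.5, size=n).astype(float)
    best, scores = select_clip_by_cv(rewards, target_probs, logging_probs,
                                     clips=[1.0, 2.0, 5.0, np.inf])
    print("selected clip:", best)
```

The validation target is itself noisy, so the selected value can vary with the random splits. Averaging over more splits reduces that variability at the cost of extra computation.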