Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model. Feb 3, 2022. Multi-Armed Bandits, Off-policy evaluation.
[Code Available 25] Off-Policy Evaluation for Large Action Spaces via Embeddings. Feb 13, 2022. Multi-Armed Bandits, Off-policy evaluation.
[Code Available 25] COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation. Apr 19, 2022. Offline RL, Off-policy evaluation.
[Code Available 15] Counterfactual Evaluation of Slate Recommendations with Sequential Reward Interactions. Jul 25, 2020. counterfactual, News Recommendation.
[Code Available 15] Anytime-valid off-policy inference for contextual bandits. Oct 19, 2022. counterfactual, Multi-Armed Bandits.
[Code Available 15] BCORLE(λ): An Offline Reinforcement Learning and Evaluation Framework for Coupons Allocation in E-commerce Market. Dec 1, 2021. Off-policy evaluation, reinforcement-learning.
[Code Available 15] Trajectory World Models for Heterogeneous Environments. Feb 3, 2025. Diversity, Model Predictive Control.
[Code Available 15] Model Selection for Offline Reinforcement Learning: Practical Considerations for Healthcare Settings. Jul 23, 2021. Computational Efficiency, Decision Making.
[Code Available 15] Evaluating the Robustness of Off-Policy Evaluation. Aug 31, 2021. Off-policy evaluation, Recommendation Systems.
[Code Available 15] Optimal Off-Policy Evaluation from Multiple Logging Policies. Oct 21, 2020. Off-policy evaluation.
[Code Available 15] Benchmarks for Deep Off-Policy Evaluation. Mar 30, 2021. Benchmarking, continuous-control.
[Code Available 15] A Policy-Guided Imitation Approach for Offline Reinforcement Learning. Oct 15, 2022. D4RL, Offline RL.
[Code Available 15] Off-Policy Evaluation of Ranking Policies under Diverse User Behavior. Jun 26, 2023. Off-policy evaluation.
[Code Available 15] A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation. Jun 12, 2021. Deep Reinforcement Learning, MuJoCo.
[Code Available 15] Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation. Jun 24, 2021. Meta Reinforcement Learning, Off-policy evaluation.
[Code Available 15] Offline RL Without Off-Policy Evaluation. Jun 16, 2021. D4RL, Offline RL.
[Code Available 15] Off-Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits. Jun 3, 2021. Multi-Armed Bandits, Off-policy evaluation.
[Code Available 15] Towards Assessing and Benchmarking Risk-Return Tradeoff of Off-Policy Evaluation. Nov 30, 2023. Benchmarking, counterfactual.
[Code Available 15] Active Offline Policy Selection. Jun 18, 2021. Bayesian Optimization, Off-policy evaluation.
[Code Available 15] Doubly Robust Distributionally Robust Off-Policy Evaluation and Learning. Feb 19, 2022. Off-policy evaluation.
[Code Available 15] Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation. Aug 17, 2020. Off-policy evaluation.
[Code Available 15] SCOPE-RL: A Python Library for Offline Reinforcement Learning and Off-Policy Evaluation. Nov 30, 2023. Offline RL, Off-policy evaluation.
[Code Available 15] Hindsight-DICE: Stable Credit Assignment for Deep Reinforcement Learning. Jul 21, 2023. Decision Making, Deep Reinforcement Learning.
[Code Available 05] Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity. Nov 5, 2020. Diversity, Off-policy evaluation.
[Code Available 05] Human Choice Prediction in Language-based Persuasion Games: Simulation-based Off-Policy Evaluation. May 17, 2023. Decision Making, Off-policy evaluation.
[Code Available 05] Adaptive Estimator Selection for Off-Policy Evaluation. Feb 18, 2020. Multi-Armed Bandits, Off-policy evaluation.
[Code Available 05] Balanced off-policy evaluation in general action spaces. Jun 9, 2019. Binary Classification, counterfactual.
[Code Available 05] Batch Stationary Distribution Estimation. Mar 2, 2020. Off-policy evaluation.
[Code Available 05] Future-Dependent Value-Based Off-Policy Evaluation in POMDPs. Jul 26, 2022. Off-policy evaluation.
[Code Available 05] Off-policy Evaluation with Deeply-abstracted States. Jun 27, 2024. Off-policy evaluation.
[Code Available 05] Conformal Off-policy Prediction. Jun 14, 2022. Conformal Prediction, Off-policy evaluation.
[Code Available 05] From Importance Sampling to Doubly Robust Policy Gradient. Oct 20, 2019. Off-policy evaluation.
[Code Available 05] Hallucinated Adversarial Control for Conservative Offline Policy Evaluation. Mar 2, 2023. continuous-control, Continuous Control.
[Code Available 05] Importance Sampling Policy Evaluation with an Estimated Behavior Policy. Jun 4, 2018. Off-policy evaluation.
[Code Available 05] Doubly Robust Kernel Statistics for Testing Distributional Treatment Effects. Dec 9, 2022. Causal Inference, counterfactual.
[Code Available 05] Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes. Aug 22, 2019. Off-policy evaluation, reinforcement-learning.
[Code Available 05] A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes. Nov 12, 2021. Off-policy evaluation.
[Code Available 05] Deep Proxy Causal Learning and its Application to Confounded Bandit Policy Evaluation. Jun 7, 2021. Off-policy evaluation.
[Code Available 05] Distributional Off-policy Evaluation with Bellman Residual Minimization. Feb 2, 2024. Distributional Reinforcement Learning, Off-policy evaluation.
[Code Available 05] Deep Jump Learning for Off-Policy Evaluation in Continuous Treatment Settings. Oct 29, 2020. Change Point Detection, Off-policy evaluation.
[Code Available 05] Deeply-Debiased Off-Policy Interval Estimation. May 10, 2021. Off-policy evaluation.
[Code Available 05] Distributional Off-Policy Evaluation for Slate Recommendations. Aug 27, 2023. Fairness, Off-policy evaluation.
[Code Available 05] DOLCE: Decomposing Off-Policy Evaluation/Learning into Lagged and Current Effects. May 2, 2025. Imputation, Off-policy evaluation.
[Code Available 05] Doubly robust off-policy evaluation with shrinkage. Jul 22, 2019. Model Selection, Multi-Armed Bandits.
[Code Available 05] Confident Off-Policy Evaluation and Selection through Self-Normalized Importance Weighting. Jun 18, 2020. Multi-Armed Bandits, Off-policy evaluation.
[Code Available 05] Doubly Robust Estimator for Off-Policy Evaluation with Large Action Spaces. Aug 7, 2023. Off-policy evaluation.
[Code Available 05] Counterfactual Mean Embeddings. May 22, 2018. Causal Inference, counterfactual.
[Code Available 05] A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets. Feb 21, 2022. Management, Multi-agent Reinforcement Learning.
[Code Available 05] Balanced Off-Policy Evaluation for Personalized Pricing. Feb 24, 2023. Off-policy evaluation.
[Code Available 05] Causal Deepsets for Off-policy Evaluation under Spatial or Spatio-temporal Interferences. Jul 25, 2024. Off-policy evaluation.
[Code Available 05]