Title | Date | Tags | Code status
Balancing Immediate Revenue and Future Off-Policy Evaluation in Coupon Allocation | Jul 6, 2024 | Off-policy evaluation | Unverified
Off-policy Evaluation with Deeply-abstracted States | Jun 27, 2024 | Off-policy evaluation | Code Available
Confident Natural Policy Gradient for Local Planning in q_π-realizable Constrained MDPs | Jun 26, 2024 | Off-policy evaluation | Unverified
Automated Off-Policy Estimator Selection via Supervised Learning | Jun 26, 2024 | counterfactual, Off-policy evaluation | Unverified
Off-Policy Evaluation from Logged Human Feedback | Jun 14, 2024 | Off-policy evaluation | Unverified
RL in Latent MDPs is Tractable: Online Guarantees via Off-Policy Evaluation | Jun 3, 2024 | LEMMA, Off-policy evaluation | Unverified
A Fast Convergence Theory for Offline Decision Making | Jun 3, 2024 | Decision Making, Offline RL | Unverified
Kernel Metric Learning for In-Sample Off-Policy Evaluation of Deterministic RL Policies | May 29, 2024 | Metric Learning, Off-policy evaluation | Code Available
OPERA: Automatic Offline Policy Evaluation with Re-weighted Aggregates of Multiple Estimators | May 27, 2024 | Decision Making, Offline RL | Unverified
Cross-Validated Off-Policy Evaluation | May 24, 2024 | Model Selection, Off-policy evaluation | Code Available
Logarithmic Smoothing for Pessimistic Off-Policy Evaluation, Selection and Learning | May 23, 2024 | Off-policy evaluation | Code Available
Long-term Off-Policy Evaluation and Learning | Apr 24, 2024 | Off-policy evaluation | Code Available
Hyperparameter Optimization Can Even be Harmful in Off-Policy Learning and How to Deal with It | Apr 23, 2024 | counterfactual, Decision Making | Unverified
Data Poisoning Attacks on Off-Policy Policy Evaluation Methods | Apr 6, 2024 | Data Poisoning, Off-policy evaluation | Unverified
Methodology for Interpretable Reinforcement Learning for Optimizing Mechanical Ventilation | Apr 3, 2024 | Off-policy evaluation, reinforcement-learning | Unverified
Doubly-Robust Off-Policy Evaluation with Estimated Logging Policy | Apr 2, 2024 | Multi-Armed Bandits, Off-policy evaluation | Unverified
Predictive Performance Comparison of Decision Policies Under Confounding | Apr 1, 2024 | Causal Inference, Decision Making | Code Available
Efficient and Sharp Off-Policy Evaluation in Robust Markov Decision Processes | Mar 29, 2024 | Off-policy evaluation | Code Available
Cramming Contextual Bandits for On-policy Statistical Evaluation | Mar 11, 2024 | Multi-Armed Bandits, Off-policy evaluation | Unverified
Bayesian Off-Policy Evaluation and Learning for Large Action Spaces | Feb 22, 2024 | Computational Efficiency, Off-policy evaluation | Unverified
On the Curses of Future and History in Future-dependent Value Functions for Off-policy Evaluation | Feb 22, 2024 | Off-policy evaluation | Unverified
Off-Policy Evaluation in Markov Decision Processes under Weak Distributional Overlap | Feb 13, 2024 | Off-policy evaluation | Unverified
Off-Policy Evaluation of Slate Bandit Policies via Optimizing Abstraction | Feb 3, 2024 | Marketing, Multi-Armed Bandits | Code Available
Distributional Off-policy Evaluation with Bellman Residual Minimization | Feb 2, 2024 | Distributional Reinforcement Learning, Off-policy evaluation | Code Available
Probabilistic Offline Policy Ranking with Approximate Bayesian Computation | Dec 17, 2023 | Off-policy evaluation | Unverified
RoME: A Robust Mixed-Effects Bandit Algorithm for Optimizing Mobile Health Interventions | Dec 11, 2023 | Multi-Armed Bandits, Off-policy evaluation | Code Available
Marginal Density Ratio for Off-Policy Evaluation in Contextual Bandits | Dec 3, 2023 | Causal Inference, Multi-Armed Bandits | Code Available
When is Off-Policy Evaluation (Reward Modeling) Useful in Contextual Bandits? A Data-Centric Perspective | Nov 23, 2023 | Large Language Model, Multi-Armed Bandits | Code Available
Unbiased Offline Evaluation for Learning to Rank with Business Rules | Nov 3, 2023 | Learning-To-Rank, Off-policy evaluation | Unverified
Robust Offline Reinforcement learning with Heavy-Tailed Rewards | Oct 28, 2023 | Offline RL, Off-policy evaluation | Code Available
State-Action Similarity-Based Representations for Off-Policy Evaluation | Oct 27, 2023 | Off-policy evaluation, Representation Learning | Code Available
Counterfactual-Augmented Importance Sampling for Semi-Offline Policy Evaluation | Oct 26, 2023 | counterfactual, Off-policy evaluation | Code Available
Off-Policy Evaluation for Large Action Spaces via Policy Convolution | Oct 24, 2023 | Multi-Armed Bandits, Off-policy evaluation | Unverified
Sample Complexity of Preference-Based Nonparametric Off-Policy Evaluation with Deep Networks | Oct 16, 2023 | Off-policy evaluation, reinforcement-learning | Unverified
Robust Multi-Agent Reinforcement Learning by Mutual Information Regularization | Oct 15, 2023 | Multi-agent Reinforcement Learning, Off-policy evaluation | Unverified
Off-Policy Evaluation for Human Feedback | Oct 11, 2023 | Off-policy evaluation, Reinforcement Learning (RL) | Unverified
Distributional Shift-Aware Off-Policy Interval Estimation: A Unified Error Quantification Framework | Sep 23, 2023 | Off-policy evaluation | Unverified
Wasserstein Distributionally Robust Policy Evaluation and Learning for Contextual Bandits | Sep 15, 2023 | Multi-Armed Bandits, Off-policy evaluation | Unverified
Statistically Efficient Variance Reduction with Double Policy Estimation for Off-Policy Evaluation in Sequence-Modeled Reinforcement Learning | Aug 28, 2023 | D4RL, Off-policy evaluation | Unverified
Distributional Off-Policy Evaluation for Slate Recommendations | Aug 27, 2023 | Fairness, Off-policy evaluation | Code Available
Doubly Robust Estimator for Off-Policy Evaluation with Large Action Spaces | Aug 7, 2023 | Off-policy evaluation | Code Available
On (Normalised) Discounted Cumulative Gain as an Off-Policy Evaluation Metric for Top-n Recommendation | Jul 27, 2023 | Information Retrieval, Off-policy evaluation | Code Available
The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation | Jul 25, 2023 | Off-policy evaluation, Reinforcement Learning (RL) | Unverified
Hindsight-DICE: Stable Credit Assignment for Deep Reinforcement Learning | Jul 21, 2023 | Decision Making, Deep Reinforcement Learning | Code Available
Leveraging Factored Action Spaces for Off-Policy Evaluation | Jul 13, 2023 | counterfactual, Off-policy evaluation | Code Available
Off-policy Evaluation in Doubly Inhomogeneous Environments | Jun 14, 2023 | Offline RL, Off-policy evaluation | Code Available
K-Nearest-Neighbor Resampling for Off-Policy Evaluation in Stochastic Control | Jun 7, 2023 | counterfactual, Off-policy evaluation | Code Available
Counterfactual Evaluation of Peer-Review Assignment Policies | May 27, 2023 | counterfactual, Off-policy evaluation | Code Available
Human Choice Prediction in Language-based Persuasion Games: Simulation-based Off-Policy Evaluation | May 17, 2023 | Decision Making, Off-policy evaluation | Code Available
Scalable and Safe Remediation of Defective Actions in Self-Learning Conversational Systems | May 17, 2023 | Off-policy evaluation, regression | Unverified