| Why Should I Trust You, Bellman? Evaluating the Bellman Objective with Off-Policy Data | Sep 29, 2021 | Deep Reinforcement Learning, Off-policy evaluation | Unverified | 0 |
| Generative Self-training for Cross-domain Unsupervised Tagged-to-Cine MRI Synthesis | Jun 23, 2021 | Domain Adaptation, Image Generation | Unverified | 0 |
| RCURRENCY: Live Digital Asset Trading Using a Recurrent Neural Network-based Forecasting System | Jun 13, 2021 | Value prediction | Unverified | 0 |
| Turing: an Accurate and Interpretable Multi-Hypothesis Cross-Domain Natural Language Database Interface | Jun 8, 2021 | Text Generation, Text to SQL | Unverified | 0 |
| On the Optimality of Batch Policy Optimization Algorithms | Apr 6, 2021 | Value prediction | Unverified | 0 |
| Learning State Representations from Random Deep Action-conditional Predictions | Feb 9, 2021 | Atari Games, Reinforcement Learning (RL) | Code Available | 0 |
| The Value Equivalence Principle for Model-Based Reinforcement Learning | Nov 6, 2020 | Model-based Reinforcement Learning, reinforcement-learning | Unverified | 0 |
| Rethinking Deep Policy Gradients via State-Wise Policy Improvement | Oct 19, 2020 | Policy Gradient Methods, Value prediction | Unverified | 0 |
| timeXplain -- A Framework for Explaining the Predictions of Time Series Classifiers | Jul 15, 2020 | Decision Making, Explainable artificial intelligence | Code Available | 0 |
| The Value-Improvement Path: Towards Better Representations for Reinforcement Learning | Jun 3, 2020 | Atari Games, reinforcement-learning | Unverified | 0 |