| Exploring Offline Policy Evaluation for the Continuous-Armed Bandit Problem | Aug 21, 2019 | Decision Making, Sequential Decision Making | Unverified | 0 |
| Online Planning for Decentralized Stochastic Control with Partial History Sharing | Aug 6, 2019 | Decision Making, Sequential Decision Making | Unverified | 0 |
| Bridging Commonsense Reasoning and Probabilistic Planning via a Probabilistic Action Language | Jul 31, 2019 | Decision Making, Sequential Decision Making | Unverified | 0 |
| Reward Learning for Efficient Reinforcement Learning in Extractive Document Summarisation | Jul 30, 2019 | Decision Making, Learning-To-Rank | Code Available | 0 |
| Bandit Convex Optimization in Non-stationary Environments | Jul 29, 2019 | Decision Making, Sequential Decision Making | Unverified | 0 |
| Scaling Multi-Armed Bandit Algorithms | Jul 25, 2019 | Multi-Armed Bandits, Sequential Decision Making | Unverified | 0 |
| IR-VIC: Unsupervised Discovery of Sub-goals for Transfer in RL | Jul 24, 2019 | Decision Making, Hierarchical Reinforcement Learning | Unverified | 0 |
| A Sufficient Statistic for Influence in Structured Multiagent Environments | Jul 22, 2019 | Decision Making, Sequential Decision Making | Unverified | 0 |
| Reward Advancement: Transforming Policy under Maximum Causal Entropy Principle | Jul 11, 2019 | Decision Making, Sequential Decision Making | Unverified | 0 |
| A Scheme for Dynamic Risk-Sensitive Sequential Decision Making | Jul 9, 2019 | Decision Making, Sequential Decision Making | Unverified | 0 |