| An Off-policy Policy Gradient Theorem Using Emphatic Weightings | Nov 22, 2018 | Policy Gradient Methods, Reinforcement Learning | Unverified | 0 |
| Reward-estimation variance elimination in sequential decision processes | Nov 15, 2018 | Policy Gradient Methods, Reinforcement Learning | Unverified | 0 |
| Risk-Sensitive Reinforcement Learning via Policy Gradient Search | Oct 22, 2018 | Policy Gradient Methods, Reinforcement Learning | Unverified | 0 |
| Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement | Oct 22, 2018 | Policy Gradient Methods, Q-Learning | Code Available | 0 |
| Policy Gradient in Partially Observable Environments: Approximation and Convergence | Oct 18, 2018 | Decision Making, Policy Gradient Methods | Unverified | 0 |
| Where Did My Optimum Go?: An Empirical Analysis of Gradient Descent Optimization in Policy Gradient Methods | Oct 5, 2018 | Continuous Control | Code Available | 0 |
| Training for Diversity in Image Paragraph Captioning | Oct 1, 2018 | Diversity, Image Captioning | Code Available | 0 |
| CaLcs: Continuously Approximating Longest Common Subsequence for Sequence Level Optimization | Oct 1, 2018 | Abstractive Text Summarization, Image Captioning | Unverified | 0 |
| Countering Language Drift via Grounding | Sep 27, 2018 | Language Modeling | Unverified | 0 |
| Assumption Questioning: Latent Copying and Reward Exploitation in Question Generation | Sep 27, 2018 | Inductive Bias, Machine Translation | Unverified | 0 |