| Information Maximizing Exploration with a Latent Dynamics Model | Apr 4, 2018 | continuous-control, Continuous Control | Unverified | 0 |
| Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines | Mar 20, 2018 | Deep Reinforcement Learning, Policy Gradient Methods | Unverified | 0 |
| The Mirage of Action-Dependent Baselines in Reinforcement Learning | Feb 27, 2018 | Policy Gradient Methods, reinforcement-learning | Code Available | 0 |
| Optimizing over a Restricted Policy Class in Markov Decision Processes | Feb 26, 2018 | Policy Gradient Methods | Unverified | 0 |
| Asynchronous stochastic approximations with asymptotically biased errors and deep multi-agent learning | Feb 22, 2018 | Multi-agent Reinforcement Learning, Policy Gradient Methods | Unverified | 0 |
| Clipped Action Policy Gradient | Feb 21, 2018 | continuous-control, Continuous Control | Code Available | 0 |
| Policy Gradients for Contextual Recommendations | Feb 12, 2018 | Decision Making, Multi-Armed Bandits | Unverified | 0 |
| Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator | Jan 15, 2018 | continuous-control, Continuous Control | Unverified | 0 |
| Expected Policy Gradients for Reinforcement Learning | Jan 10, 2018 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |
| Global Convergence of Policy Gradient Methods for Linearized Control Problems | Jan 1, 2018 | continuous-control, Continuous Control | Unverified | 0 |