| Trajectory-wise Control Variates for Variance Reduction in Policy Gradient Methods | Aug 8, 2019 | Policy Gradient Methods, Reinforcement Learning | Unverified | 0 |
| Health-Informed Policy Gradients for Multi-Agent Reinforcement Learning | Aug 2, 2019 | Multi-agent Reinforcement Learning, Policy Gradient Methods | Code Available | 0 |
| On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift | Aug 1, 2019 | Policy Gradient Methods, Reinforcement Learning | Unverified | 0 |
| Hindsight Trust Region Policy Optimization | Jul 29, 2019 | Atari Games, Policy Gradient Methods | Code Available | 0 |
| Variance Reduction in Actor Critic Methods (ACM) | Jul 23, 2019 | Policy Gradient Methods | Unverified | 0 |
| Shapley Q-value: A Local Reward Approach to Solve Global Reward Games | Jul 11, 2019 | Multi-agent Reinforcement Learning, Policy Gradient Methods | Code Available | 0 |
| Policy Optimization with Stochastic Mirror Descent | Jun 25, 2019 | Continuous Control, Policy Gradient Methods | Unverified | 0 |
| Ranking Policy Gradient | Jun 24, 2019 | Policy Gradient Methods, Reinforcement Learning | Code Available | 0 |
| Entropic Risk Measure in Policy Search | Jun 21, 2019 | Policy Gradient Methods | Unverified | 0 |
| Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies | Jun 19, 2019 | Autonomous Driving, Policy Gradient Methods | Unverified | 0 |