| On-Policy Trust Region Policy Optimisation with Replay Buffers | Jan 18, 2019 | Continuous Control, Deep Reinforcement Learning | Code Available | 0 |
| Trajectory-Based Off-Policy Deep Reinforcement Learning | May 14, 2019 | Continuous Control | Code Available | 0 |
| Policy Gradient in Robust MDPs with Global Convergence Guarantee | Dec 20, 2022 | Policy Gradient Methods | Code Available | 0 |
| Clipped Action Policy Gradient | Feb 21, 2018 | Continuous Control | Code Available | 0 |
| Learning Goal-Oriented Visual Dialog via Tempered Policy Gradient | Jul 2, 2018 | Deep Reinforcement Learning, Policy Gradient Methods | Code Available | 0 |
| Ranking Policy Gradient | Jun 24, 2019 | Policy Gradient Methods, Reinforcement Learning | Code Available | 0 |
| Divide-and-Conquer Reinforcement Learning | Nov 27, 2017 | Deep Reinforcement Learning, Policy Gradient Methods | Code Available | 0 |
| Bayesian Policy Gradients via Alpha Divergence Dropout Inference | Dec 6, 2017 | Continuous Control | Code Available | 0 |
| Distributional constrained reinforcement learning for supply chain optimization | Feb 3, 2023 | Distributional Reinforcement Learning, Policy Gradient Methods | Code Available | 0 |
| Jointly Learning Environments and Control Policies with Projected Stochastic Gradient Ascent | Jun 2, 2020 | Deep Reinforcement Learning, Policy Gradient Methods | Code Available | 0 |