| Invariant Policy Optimization: Towards Stronger Generalization in Reinforcement Learning | Jun 1, 2020 | Policy Gradient Methods, reinforcement-learning | Code Available | 1 |
| Distributional Policy Optimization: An Alternative Approach for Continuous Control | May 23, 2019 | continuous-control, Continuous Control | Code Available | 1 |
| Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning | Nov 4, 2018 | Decoder, Multi-agent Reinforcement Learning | Code Available | 1 |
| Self-critical Sequence Training for Image Captioning | Dec 2, 2016 | Image Captioning, Policy Gradient Methods | Code Available | 1 |
| Trust Region Policy Optimization | Feb 19, 2015 | Atari Games, Policy Gradient Methods | Code Available | 1 |
| Improving DAPO from a Mixed-Policy Perspective | Jul 17, 2025 | Policy Gradient Methods | Unverified | 0 |
| Local Pairwise Distance Matching for Backpropagation-Free Reinforcement Learning | Jul 15, 2025 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |
| Solving Zero-Sum Convex Markov Games | Jun 19, 2025 | Policy Gradient Methods | Unverified | 0 |
| Enhanced DACER Algorithm with High Diffusion Efficiency | May 29, 2025 | Denoising, Imitation Learning | Unverified | 0 |
| On Global Convergence Rates for Federated Policy Gradient under Heterogeneous Environment | May 29, 2025 | Federated Learning, Policy Gradient Methods | Unverified | 0 |