| Distributional Policy Optimization: An Alternative Approach for Continuous Control | May 23, 2019 | continuous-control, Continuous Control | Code Available | 1 |
| Experimental design for MRI by greedy policy search | Oct 30, 2020 | Experimental Design, Policy Gradient Methods | Code Available | 1 |
| Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization | Oct 3, 2022 | Decision Making, Policy Gradient Methods | Code Available | 1 |
| Learning Multi-Agent Communication through Structured Attentive Reasoning | Dec 1, 2020 | Decision Making, Deep Reinforcement Learning | Code Available | 1 |
| Lifelong Policy Gradient Learning of Factored Policies for Faster Training Without Forgetting | Jul 14, 2020 | Lifelong learning, Policy Gradient Methods | Code Available | 1 |
| Model-free Policy Learning with Reward Gradients | Mar 9, 2021 | Continuous Control, model | Code Available | 1 |
| An Attentive Graph Agent for Topology-Adaptive Cyber Defence | Jan 24, 2025 | Graph Attention, Graph Neural Network | Code Available | 1 |
| An Efficient Asynchronous Method for Integrating Evolutionary and Gradient-based Policy Search | Dec 10, 2020 | continuous-control, Continuous Control | Code Available | 1 |
| Invariant Policy Optimization: Towards Stronger Generalization in Reinforcement Learning | Jun 1, 2020 | Policy Gradient Methods, reinforcement-learning | Code Available | 1 |
| Self-critical Sequence Training for Image Captioning | Dec 2, 2016 | Image Captioning, Policy Gradient Methods | Code Available | 1 |