| Competitive Policy Optimization | Jun 18, 2020 | Policy Gradient Methods | Code Available | 1 | 5 |
| Learning Opinion Summarizers by Selecting Informative Reviews | Sep 9, 2021 | Few-Shot Learning, Opinion Summarization | Code Available | 1 | 5 |
| Deep Bayesian Quadrature Policy Optimization | Jun 28, 2020 | continuous-control, Continuous Control | Code Available | 1 | 5 |
| Deep Policy Gradient Methods Without Batch Updates, Target Networks, or Replay Buffers | Nov 22, 2024 | Avg, Deep Reinforcement Learning | Code Available | 1 | 5 |
| Efficient Diffusion Policies for Offline Reinforcement Learning | May 31, 2023 | D4RL, Offline RL | Code Available | 1 | 5 |
| Invariant Policy Optimization: Towards Stronger Generalization in Reinforcement Learning | Jun 1, 2020 | Policy Gradient Methods, reinforcement-learning | Code Available | 1 | 5 |
| An Attentive Graph Agent for Topology-Adaptive Cyber Defence | Jan 24, 2025 | Graph Attention, Graph Neural Network | Code Available | 1 | 5 |
| An Efficient Asynchronous Method for Integrating Evolutionary and Gradient-based Policy Search | Dec 10, 2020 | continuous-control, Continuous Control | Code Available | 1 | 5 |
| Neural Inventory Control in Networks via Hindsight Differentiable Policy Optimization | Jun 20, 2023 | Deep Reinforcement Learning, Management | Code Available | 1 | 5 |
| Self-critical Sequence Training for Image Captioning | Dec 2, 2016 | Image Captioning, Policy Gradient Methods | Code Available | 1 | 5 |