| Efficient Diffusion Policies for Offline Reinforcement Learning | May 31, 2023 | D4RL, Offline RL | Code Available | 1 | 5 |
| An Attentive Graph Agent for Topology-Adaptive Cyber Defence | Jan 24, 2025 | Graph Attention, Graph Neural Network | Code Available | 1 | 5 |
| An Efficient Asynchronous Method for Integrating Evolutionary and Gradient-based Policy Search | Dec 10, 2020 | Continuous Control | Code Available | 1 | 5 |
| Learning Opinion Summarizers by Selecting Informative Reviews | Sep 9, 2021 | Few-Shot Learning, Opinion Summarization | Code Available | 1 | 5 |
| Online Portfolio Management via Deep Reinforcement Learning with High-Frequency Data | May 1, 2023 | Deep Reinforcement Learning, Management | Code Available | 1 | 5 |
| Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning | Jul 12, 2022 | Lifelong learning, Policy Gradient Methods | Code Available | 1 | 5 |
| Learning Multi-Agent Communication through Structured Attentive Reasoning | Dec 1, 2020 | Decision Making, Deep Reinforcement Learning | Code Available | 1 | 5 |
| Lifelong Policy Gradient Learning of Factored Policies for Faster Training Without Forgetting | Jul 14, 2020 | Lifelong learning, Policy Gradient Methods | Code Available | 1 | 5 |
| Model-free Policy Learning with Reward Gradients | Mar 9, 2021 | Continuous Control, model | Code Available | 1 | 5 |
| Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods | Feb 3, 2025 | Language Modeling | Code Available | 1 | 5 |
| Hindsight Value Function for Variance Reduction in Stochastic Dynamic Environment | Jul 26, 2021 | Deep Reinforcement Learning, Policy Gradient Methods | Code Available | 0 | 5 |
| Hindsight Trust Region Policy Optimization | Jul 29, 2019 | Atari Games, Policy Gradient Methods | Code Available | 0 | 5 |
| Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents | Dec 18, 2017 | Deep Reinforcement Learning, Policy Gradient Methods | Code Available | 0 | 5 |
| Action-depedent Control Variates for Policy Optimization via Stein's Identity | Oct 30, 2017 | Policy Gradient Methods, Reinforcement Learning | Code Available | 0 | 5 |
| High-Dimensional Continuous Control Using Generalized Advantage Estimation | Jun 8, 2015 | Continuous Control | Code Available | 0 | 5 |
| Hierarchical Policy-Gradient Reinforcement Learning for Multi-Agent Shepherding Control of Non-Cohesive Targets | Apr 3, 2025 | Policy Gradient Methods, Reinforcement Learning | Code Available | 0 | 5 |
| Hindsight-DICE: Stable Credit Assignment for Deep Reinforcement Learning | Jul 21, 2023 | Decision Making, Deep Reinforcement Learning | Code Available | 0 | 5 |
| Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch | Nov 4, 2021 | Policy Gradient Methods | Code Available | 0 | 5 |
| Understanding the Effects of Second-Order Approximations in Natural Policy Gradient Reinforcement Learning | Jan 22, 2022 | Policy Gradient Methods, Reinforcement Learning | Code Available | 0 | 5 |
| Enabling Efficient, Reliable Real-World Reinforcement Learning with Approximate Physics-Based Models | Jul 16, 2023 | Policy Gradient Methods | Code Available | 0 | 5 |
| Health-Informed Policy Gradients for Multi-Agent Reinforcement Learning | Aug 2, 2019 | Multi-agent Reinforcement Learning, Policy Gradient Methods | Code Available | 0 | 5 |
| Hindsight policy gradients | Nov 16, 2017 | Policy Gradient Methods, Reinforcement Learning | Code Available | 0 | 5 |
| Fast Efficient Hyperparameter Tuning for Policy Gradient Methods | Dec 1, 2019 | Policy Gradient Methods | Code Available | 0 | 5 |
| Evaluating Rewards for Question Generation Models | Feb 28, 2019 | Machine Translation, Policy Gradient Methods | Code Available | 0 | 5 |
| Dual Learning for Machine Translation | Nov 1, 2016 | Language Modeling | Code Available | 0 | 5 |