| Neural Logic Reinforcement Learning | Apr 24, 2019 | Deep Reinforcement Learning, Inductive logic programming | Code Available | 0 | 5 |
| Policy Gradient for Robust Markov Decision Processes | Oct 29, 2024 | Policy Gradient Methods | Code Available | 0 | 5 |
| Momentum-Based Policy Gradient Methods | Jul 13, 2020 | Policy Gradient Methods | Code Available | 0 | 5 |
| Multilinear Tensor Low-Rank Approximation for Policy-Gradient Methods in Reinforcement Learning | Jan 8, 2025 | Policy Gradient Methods, Reinforcement Learning (RL) | Code Available | 0 | 5 |
| Learning Goal-Oriented Visual Dialog via Tempered Policy Gradient | Jul 2, 2018 | Deep Reinforcement Learning, Policy Gradient Methods | Code Available | 0 | 5 |
| Action-depedent Control Variates for Policy Optimization via Stein's Identity | Oct 30, 2017 | Policy Gradient Methods, reinforcement-learning | Code Available | 0 | 5 |
| Hindsight Value Function for Variance Reduction in Stochastic Dynamic Environment | Jul 26, 2021 | Deep Reinforcement Learning, Policy Gradient Methods | Code Available | 0 | 5 |
| Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement | Oct 22, 2018 | Policy Gradient Methods, Q-Learning | Code Available | 0 | 5 |
| Clipped Action Policy Gradient | Feb 21, 2018 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| Clipped-Objective Policy Gradients for Pessimistic Policy Optimization | Nov 10, 2023 | Deep Reinforcement Learning, Multi-Task Learning | Code Available | 0 | 5 |
| Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents | Dec 18, 2017 | Deep Reinforcement Learning, Policy Gradient Methods | Code Available | 0 | 5 |
| Oracle Complexity Reduction for Model-free LQR: A Stochastic Variance-Reduced Policy Gradient Approach | Sep 19, 2023 | Policy Gradient Methods | Code Available | 0 | 5 |
| Hindsight-DICE: Stable Credit Assignment for Deep Reinforcement Learning | Jul 21, 2023 | Decision Making, Deep Reinforcement Learning | Code Available | 0 | 5 |
| Neural Replicator Dynamics | Jun 1, 2019 | counterfactual, Deep Reinforcement Learning | Code Available | 0 | 5 |
| Hindsight policy gradients | Nov 16, 2017 | Policy Gradient Methods, reinforcement-learning | Code Available | 0 | 5 |
| Hierarchical Policy-Gradient Reinforcement Learning for Multi-Agent Shepherding Control of Non-Cohesive Targets | Apr 3, 2025 | Policy Gradient Methods, reinforcement-learning | Code Available | 0 | 5 |
| A general class of surrogate functions for stable and efficient reinforcement learning | Aug 12, 2021 | MuJoCo, Policy Gradient Methods | Code Available | 0 | 5 |
| High-Dimensional Continuous Control Using Generalized Advantage Estimation | Jun 8, 2015 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| Hindsight Trust Region Policy Optimization | Jul 29, 2019 | Atari Games, Policy Gradient Methods | Code Available | 0 | 5 |
| Enabling Efficient, Reliable Real-World Reinforcement Learning with Approximate Physics-Based Models | Jul 16, 2023 | Policy Gradient Methods | Code Available | 0 | 5 |
| Fast Efficient Hyperparameter Tuning for Policy Gradient Methods | Dec 1, 2019 | Policy Gradient Methods | Code Available | 0 | 5 |
| Evaluating Rewards for Question Generation Models | Feb 28, 2019 | Machine Translation, Policy Gradient Methods | Code Available | 0 | 5 |
| Convergence Guarantees of Model-free Policy Gradient Methods for LQR with Stochastic Data | Feb 27, 2025 | Policy Gradient Methods | Code Available | 0 | 5 |
| Fast Efficient Hyperparameter Tuning for Policy Gradients | Feb 18, 2019 | Meta-Learning, Policy Gradient Methods | Code Available | 0 | 5 |
| Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch | Nov 4, 2021 | Policy Gradient Methods | Code Available | 0 | 5 |