| Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement | Oct 22, 2018 | Policy Gradient Methods, Q-Learning | Code Available | 0 | 5 |
| On Learning Intrinsic Rewards for Policy Gradient Methods | Apr 17, 2018 | Atari Games, Decision Making | Code Available | 0 | 5 |
| Hindsight-DICE: Stable Credit Assignment for Deep Reinforcement Learning | Jul 21, 2023 | Decision Making, Deep Reinforcement Learning | Code Available | 0 | 5 |
| Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents | Dec 18, 2017 | Deep Reinforcement Learning, Policy Gradient Methods | Code Available | 0 | 5 |
| Neural Replicator Dynamics | Jun 1, 2019 | counterfactual, Deep Reinforcement Learning | Code Available | 0 | 5 |
| Health-Informed Policy Gradients for Multi-Agent Reinforcement Learning | Aug 2, 2019 | Multi-agent Reinforcement Learning, Policy Gradient Methods | Code Available | 0 | 5 |
| Fast Efficient Hyperparameter Tuning for Policy Gradient Methods | Dec 1, 2019 | Policy Gradient Methods | Code Available | 0 | 5 |
| A general class of surrogate functions for stable and efficient reinforcement learning | Aug 12, 2021 | MuJoCo, Policy Gradient Methods | Code Available | 0 | 5 |
| Enabling Efficient, Reliable Real-World Reinforcement Learning with Approximate Physics-Based Models | Jul 16, 2023 | Policy Gradient Methods | Code Available | 0 | 5 |
| Hierarchical Policy-Gradient Reinforcement Learning for Multi-Agent Shepherding Control of Non-Cohesive Targets | Apr 3, 2025 | Policy Gradient Methods, reinforcement-learning | Code Available | 0 | 5 |