| Policy-Aware Model Learning for Policy Gradient Methods | Feb 28, 2020 | model, Model-based Reinforcement Learning | Code Available | 0 |
| Multilinear Tensor Low-Rank Approximation for Policy-Gradient Methods in Reinforcement Learning | Jan 8, 2025 | Policy Gradient Methods, Reinforcement Learning (RL) | Code Available | 0 |
| The Performance Impact of Combining Agent Factorization with Different Learning Algorithms for Multiagent Coordination | Sep 9, 2022 | Management, Policy Gradient Methods | Code Available | 0 |
| Policy Gradient for Robust Markov Decision Processes | Oct 29, 2024 | Policy Gradient Methods | Code Available | 0 |
| V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control | Sep 26, 2019 | continuous-control, Continuous Control | Code Available | 0 |
| Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form | Aug 29, 2024 | Form, Policy Gradient Methods | Code Available | 0 |
| Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents | Dec 18, 2017 | Deep Reinforcement Learning, Policy Gradient Methods | Code Available | 0 |
| Convergence Guarantees of Model-free Policy Gradient Methods for LQR with Stochastic Data | Feb 27, 2025 | Policy Gradient Methods | Code Available | 0 |
| Neural Logic Reinforcement Learning | Apr 24, 2019 | Deep Reinforcement Learning, Inductive logic programming | Code Available | 0 |
| On the Convergence Theory of Debiased Model-Agnostic Meta-Reinforcement Learning | Feb 12, 2020 | Meta-Learning, Meta Reinforcement Learning | Code Available | 0 |
| Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods | Nov 6, 2021 | MuJoCo, Policy Gradient Methods | Code Available | 0 |
| Run, skeleton, run: skeletal model in a physics-based simulation | Nov 18, 2017 | Navigate, Policy Gradient Methods | Code Available | 0 |
| Client Selection for Federated Policy Optimization with Environment Heterogeneity | May 18, 2023 | MuJoCo, Policy Gradient Methods | Code Available | 0 |
| Training for Diversity in Image Paragraph Captioning | Oct 1, 2018 | Diversity, Image Captioning | Code Available | 0 |
| Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement | Oct 22, 2018 | Policy Gradient Methods, Q-Learning | Code Available | 0 |
| Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic | Nov 7, 2016 | continuous-control, Continuous Control | Code Available | 0 |
| Evaluating Rewards for Question Generation Models | Feb 28, 2019 | Machine Translation, Policy Gradient Methods | Code Available | 0 |
| Dual Learning for Machine Translation | Nov 1, 2016 | Language Modeling, Language Modelling | Code Available | 0 |
| On Learning Intrinsic Rewards for Policy Gradient Methods | Apr 17, 2018 | Atari Games, Decision Making | Code Available | 0 |
| Cold-Start Reinforcement Learning with Softmax Policy Gradient | Sep 27, 2017 | Image Captioning, Policy Gradient Methods | Code Available | 0 |
| On-Policy Trust Region Policy Optimisation with Replay Buffers | Jan 18, 2019 | Continuous Control, Deep Reinforcement Learning | Code Available | 0 |
| Trajectory-Based Off-Policy Deep Reinforcement Learning | May 14, 2019 | continuous-control, Continuous Control | Code Available | 0 |
| Policy Gradient in Robust MDPs with Global Convergence Guarantee | Dec 20, 2022 | Policy Gradient Methods | Code Available | 0 |
| Clipped Action Policy Gradient | Feb 21, 2018 | continuous-control, Continuous Control | Code Available | 0 |
| Learning Goal-Oriented Visual Dialog via Tempered Policy Gradient | Jul 2, 2018 | Deep Reinforcement Learning, Policy Gradient Methods | Code Available | 0 |
| Ranking Policy Gradient | Jun 24, 2019 | Policy Gradient Methods, Reinforcement Learning | Code Available | 0 |
| Divide-and-Conquer Reinforcement Learning | Nov 27, 2017 | Deep Reinforcement Learning, Policy Gradient Methods | Code Available | 0 |
| Bayesian Policy Gradients via Alpha Divergence Dropout Inference | Dec 6, 2017 | continuous-control, Continuous Control | Code Available | 0 |
| Distributional constrained reinforcement learning for supply chain optimization | Feb 3, 2023 | Distributional Reinforcement Learning, Policy Gradient Methods | Code Available | 0 |
| Jointly Learning Environments and Control Policies with Projected Stochastic Gradient Ascent | Jun 2, 2020 | Deep Reinforcement Learning, Policy Gradient Methods | Code Available | 0 |
| Neural Replicator Dynamics | Jun 1, 2019 | counterfactual, Deep Reinforcement Learning | Code Available | 0 |
| Understanding the Effects of Second-Order Approximations in Natural Policy Gradient Reinforcement Learning | Jan 22, 2022 | Policy Gradient Methods, reinforcement-learning | Code Available | 0 |