| Correcting discount-factor mismatch in on-policy policy gradient methods | Jun 23, 2023 | OpenAI Gym, Policy Gradient Methods | —Unverified | 0 |
| Entropy annealing for policy mirror descent in continuous time and space | May 30, 2024 | Policy Gradient Methods | —Unverified | 0 |
| Approximation Benefits of Policy Gradient Methods with Aggregated States | Jul 22, 2020 | Policy Gradient Methods | —Unverified | 0 |
| Convergence of policy gradient methods for finite-horizon exploratory linear-quadratic control problems | Nov 1, 2022 | Policy Gradient Methods | —Unverified | 0 |
| Equivalence Between Policy Gradients and Soft Q-Learning | Apr 21, 2017 | Policy Gradient Methods, Q-Learning | —Unverified | 0 |
| Equivalence of stochastic and deterministic policy gradients | May 29, 2025 | Continuous Control | —Unverified | 0 |
| Optimal Rates of Convergence for Entropy Regularization in Discounted Markov Decision Processes | Jun 6, 2024 | Policy Gradient Methods | —Unverified | 0 |
| Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization | Oct 19, 2021 | Policy Gradient Methods, Reinforcement Learning (RL) | —Unverified | 0 |
| Evolutionary Policy Optimization | Apr 17, 2025 | Policy Gradient Methods, Reinforcement Learning (RL) | —Unverified | 0 |
| A Policy Gradient Framework for Stochastic Optimal Control Problems with Global Convergence Guarantee | Feb 11, 2023 | Policy Gradient Methods | —Unverified | 0 |