| Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences | Jul 17, 2021 | Policy Gradient Methods | —Unverified | 0 |
| Guided Adaptive Credit Assignment for Sample Efficient Policy Optimization | Sep 25, 2019 | Instruction Following, Policy Gradient Methods | —Unverified | 0 |
| A Policy Gradient Framework for Stochastic Optimal Control Problems with Global Convergence Guarantee | Feb 11, 2023 | Policy Gradient Methods | —Unverified | 0 |
| Ad Headline Generation using Self-Critical Masked Language Model | Jun 1, 2021 | Headline Generation, Language Modeling | —Unverified | 0 |
| Convergence of policy gradient methods for finite-horizon exploratory linear-quadratic control problems | Nov 1, 2022 | Policy Gradient Methods | —Unverified | 0 |
| Homotopic Policy Mirror Descent: Policy Convergence, Implicit Regularization, and Improved Sample Complexity | Jan 24, 2022 | Policy Gradient Methods | —Unverified | 0 |
| Correcting discount-factor mismatch in on-policy policy gradient methods | Jun 23, 2023 | OpenAI Gym, Policy Gradient Methods | —Unverified | 0 |
| Approximation Benefits of Policy Gradient Methods with Aggregated States | Jul 22, 2020 | Policy Gradient Methods | —Unverified | 0 |
| Countering Language Drift via Grounding | Sep 27, 2018 | Language Modeling, Language Modelling | —Unverified | 0 |
| Global Convergence of Policy Gradient Methods for Linearized Control Problems | Jan 1, 2018 | continuous-control, Continuous Control | —Unverified | 0 |