| Focused Hierarchical RNNs for Conditional Sequence Processing | Jun 12, 2018 | Open-Domain Question Answering, Policy Gradient Methods | —Unverified | 0 |
| Improving Reward-Conditioned Policies for Multi-Armed Bandits using Normalized Weight Functions | Jun 16, 2024 | Multi-Armed Bandits, Policy Gradient Methods | —Unverified | 0 |
| Convergence and Price of Anarchy Guarantees of the Softmax Policy Gradient in Markov Potential Games | Jun 15, 2022 | Policy Gradient Methods | —Unverified | 0 |
| Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings | Oct 30, 2021 | Policy Gradient Methods, reinforcement-learning | —Unverified | 0 |
| An Initial Introduction to Cooperative Multi-Agent Reinforcement Learning | May 10, 2024 | Misconceptions, Multi-agent Reinforcement Learning | —Unverified | 0 |
| Control randomisation approach for policy gradient and application to reinforcement learning in optimal switching | Apr 27, 2024 | Policy Gradient Methods | —Unverified | 0 |
| Global Convergence of Policy Gradient Methods in Reinforcement Learning, Games and Control | Oct 8, 2023 | Decision Making, Policy Gradient Methods | —Unverified | 0 |
| Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies | Jun 19, 2019 | Autonomous Driving, Policy Gradient Methods | —Unverified | 0 |
| Controlling an Inverted Pendulum with Policy Gradient Methods-A Tutorial | May 17, 2021 | OpenAI Gym, Policy Gradient Methods | —Unverified | 0 |
| On Linear Convergence of Policy Gradient Methods for Finite MDPs | Jul 21, 2020 | Policy Gradient Methods | —Unverified | 0 |
| Adaptive Step-Size for Policy Gradient Methods | Dec 1, 2013 | Policy Gradient Methods, Reinforcement Learning | —Unverified | 0 |
| Global Convergence Using Policy Gradient Methods for Model-free Markovian Jump Linear Quadratic Control | Nov 30, 2021 | Policy Gradient Methods | —Unverified | 0 |
| Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch | Nov 4, 2021 | Policy Gradient Methods | —Unverified | 0 |
| Global Optimality Guarantees For Policy Gradient Methods | Jun 5, 2019 | Policy Gradient Methods, Reinforcement Learning | —Unverified | 0 |
| Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles | Mar 18, 2024 | Policy Gradient Methods | —Unverified | 0 |
| Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences | Jul 17, 2021 | Policy Gradient Methods | —Unverified | 0 |
| Guided Adaptive Credit Assignment for Sample Efficient Policy Optimization | Sep 25, 2019 | Instruction Following, Policy Gradient Methods | —Unverified | 0 |
| A Policy Gradient Framework for Stochastic Optimal Control Problems with Global Convergence Guarantee | Feb 11, 2023 | Policy Gradient Methods | —Unverified | 0 |
| Ad Headline Generation using Self-Critical Masked Language Model | Jun 1, 2021 | Headline Generation, Language Modeling | —Unverified | 0 |
| Convergence of policy gradient methods for finite-horizon exploratory linear-quadratic control problems | Nov 1, 2022 | Policy Gradient Methods | —Unverified | 0 |
| Homotopic Policy Mirror Descent: Policy Convergence, Implicit Regularization, and Improved Sample Complexity | Jan 24, 2022 | Policy Gradient Methods | —Unverified | 0 |
| Correcting discount-factor mismatch in on-policy policy gradient methods | Jun 23, 2023 | OpenAI Gym, Policy Gradient Methods | —Unverified | 0 |
| Approximation Benefits of Policy Gradient Methods with Aggregated States | Jul 22, 2020 | Policy Gradient Methods | —Unverified | 0 |
| Countering Language Drift via Grounding | Sep 27, 2018 | Language Modeling, Language Modelling | —Unverified | 0 |
| Global Convergence of Policy Gradient Methods for Linearized Control Problems | Jan 1, 2018 | continuous-control, Continuous Control | —Unverified | 0 |