| Stochastic Second-Order Methods Improve Best-Known Sample Complexity of SGD for Gradient-Dominated Function | May 25, 2022 | Policy Gradient Methods, Reinforcement Learning (RL) | Unverified | 0 |
| The Sufficiency of Off-Policyness and Soft Clipping: PPO is still Insufficient according to an Off-Policy Measure | May 20, 2022 | Efficient Exploration, Policy Gradient Methods | Code Available | 1 |
| Momentum-Based Policy Gradient with Second-Order Information | May 17, 2022 | Policy Gradient Methods | Unverified | 0 |
| Stochastic first-order methods for average-reward Markov decision processes | May 11, 2022 | Policy Gradient Methods | Unverified | 0 |
| Learning to Constrain Policy Optimization with Virtual Trust Region | Apr 20, 2022 | Atari Games, Policy Gradient Methods | Unverified | 0 |
| Independent Natural Policy Gradient Methods for Potential Games: Finite-time Global Convergence with Entropy Regularization | Apr 12, 2022 | Autonomous Vehicles, Policy Gradient Methods | Unverified | 0 |
| Synthesis of Stabilizing Recurrent Equilibrium Network Controllers | Mar 31, 2022 | Policy Gradient Methods | Code Available | 0 |
| Asynchronous, Option-Based Multi-Agent Policy Gradient: A Conditional Reasoning Approach | Mar 29, 2022 | Hierarchical Reinforcement Learning, Multi-agent Reinforcement Learning | Unverified | 0 |
| Non-Parametric Stochastic Policy Gradient with Strategic Retreat for Non-Stationary Environment | Mar 24, 2022 | Policy Gradient Methods | Unverified | 0 |
| Linear convergence of a policy gradient method for some finite horizon continuous time control problems | Mar 22, 2022 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |