| Reusing Historical Trajectories in Natural Policy Gradient via Importance Sampling: Convergence and Convergence Rate | Mar 1, 2024 | Policy Gradient Methods | —Unverified | 0 |
| When Do Off-Policy and On-Policy Policy Gradient Methods Align? | Feb 19, 2024 | Policy Gradient Methods | —Unverified | 0 |
| Identifying Policy Gradient Subspaces | Jan 12, 2024 | Continuous Control | —Unverified | 0 |
| Global Convergence of Natural Policy Gradient with Hessian-aided Momentum Variance Reduction | Jan 2, 2024 | MuJoCo, Policy Gradient Methods | —Unverified | 0 |
| Training Diffusion Models Towards Diverse Image Generation with Reinforcement Learning | Jan 1, 2024 | Decision Making, Diversity | —Unverified | 0 |
| Optimistic Policy Gradient in Multi-Player Markov Games with a Single Controller: Convergence Beyond the Minty Property | Dec 19, 2023 | Policy Gradient Methods | —Unverified | 0 |
| Privacy Preserving Multi-Agent Reinforcement Learning in Supply Chains | Dec 9, 2023 | Multi-agent Reinforcement Learning, Policy Gradient Methods | —Unverified | 0 |
| RL Dreams: Policy Gradient Optimization for Score Distillation based 3D Generation | Dec 8, 2023 | 3D Generation, Denoising | —Unverified | 0 |
| Score-Aware Policy-Gradient Methods and Performance Guarantees using Local Lyapunov Conditions: Applications to Product-Form Stochastic Networks and Queueing Systems | Dec 5, 2023 | Form, Model-based Reinforcement Learning | —Unverified | 0 |
| Predictable Reinforcement Learning Dynamics through Entropy Rate Minimization | Nov 30, 2023 | Policy Gradient Methods, reinforcement-learning | Code Available | 0 |