| Entropy annealing for policy mirror descent in continuous time and space | May 30, 2024 | Policy Gradient Methods | Unverified | 0 |
| Mollification Effects of Policy Gradient Methods | May 28, 2024 | Continuous Control | Unverified | 0 |
| Linear Function Approximation as a Computationally Efficient Method to Solve Classical Reinforcement Learning Challenges | May 27, 2024 | Acrobot, Policy Gradient Methods | Unverified | 0 |
| Matrix Low-Rank Approximation For Policy Gradient Methods | May 27, 2024 | Matrix Completion, Policy Gradient Methods | Code Available | 0 |
| Policy Gradient Methods for Risk-Sensitive Distributional Reinforcement Learning with Provable Convergence | May 23, 2024 | Distributional Reinforcement Learning, Policy Gradient Methods | Unverified | 0 |
| Almost sure convergence rates of stochastic gradient methods under gradient domination | May 22, 2024 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |
| An Initial Introduction to Cooperative Multi-Agent Reinforcement Learning | May 10, 2024 | Misconceptions, Multi-agent Reinforcement Learning | Unverified | 0 |
| Federated Reinforcement Learning with Constraint Heterogeneity | May 6, 2024 | Language Modeling | Unverified | 0 |
| Off-OAB: Off-Policy Policy Gradient Method with Optimal Action-Dependent Baseline | May 4, 2024 | Computational Efficiency, MuJoCo | Unverified | 0 |
| Information-Theoretic Opacity-Enforcement in Markov Decision Processes | Apr 30, 2024 | Policy Gradient Methods | Unverified | 0 |
| Control randomisation approach for policy gradient and application to reinforcement learning in optimal switching | Apr 27, 2024 | Policy Gradient Methods | Unverified | 0 |
| Actor-Critic Reinforcement Learning with Phased Actor | Apr 18, 2024 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |
| Intervention-Assisted Policy Gradient Methods for Online Stochastic Queuing Network Optimization: Technical Report | Apr 5, 2024 | Deep Reinforcement Learning, Policy Gradient Methods | Unverified | 0 |
| Elementary Analysis of Policy Gradient Methods | Apr 4, 2024 | Policy Gradient Methods | Unverified | 0 |
| Self-Improvement for Neural Combinatorial Optimization: Sample without Replacement, but Improvement | Mar 22, 2024 | Combinatorial Optimization, Imitation Learning | Code Available | 1 |
| ReAct Meets ActRe: When Language Agents Enjoy Training Data Autonomy | Mar 21, 2024 | Policy Gradient Methods | Unverified | 0 |
| Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles | Mar 18, 2024 | Policy Gradient Methods | Unverified | 0 |
| Global Convergence Guarantees for Federated Policy Gradient Methods with Adversaries | Mar 15, 2024 | Decision Making, Policy Gradient Methods | Unverified | 0 |
| Towards Efficient Risk-Sensitive Policy Gradient: An Iteration Complexity Analysis | Mar 13, 2024 | Policy Gradient Methods, Reinforcement Learning (RL) | Unverified | 0 |
| Provable Policy Gradient Methods for Average-Reward Markov Potential Games | Mar 9, 2024 | Policy Gradient Methods | Unverified | 0 |
| Fill-and-Spill: Deep Reinforcement Learning Policy Gradient Methods for Reservoir Operation Decision and Control | Mar 7, 2024 | Deep Reinforcement Learning, Policy Gradient Methods | Unverified | 0 |
| Stabilizing Policy Gradients for Stochastic Differential Equations via Consistency with Perturbation Process | Mar 7, 2024 | Drug Design, Policy Gradient Methods | Unverified | 0 |
| Towards Provable Log Density Policy Gradient | Mar 3, 2024 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |
| Reusing Historical Trajectories in Natural Policy Gradient via Importance Sampling: Convergence and Convergence Rate | Mar 1, 2024 | Policy Gradient Methods | Unverified | 0 |
| When Do Off-Policy and On-Policy Policy Gradient Methods Align? | Feb 19, 2024 | Policy Gradient Methods | Unverified | 0 |
| Identifying Policy Gradient Subspaces | Jan 12, 2024 | Continuous Control | Unverified | 0 |
| Global Convergence of Natural Policy Gradient with Hessian-aided Momentum Variance Reduction | Jan 2, 2024 | MuJoCo, Policy Gradient Methods | Unverified | 0 |
| Training Diffusion Models Towards Diverse Image Generation with Reinforcement Learning | Jan 1, 2024 | Decision Making, Diversity | Unverified | 0 |
| Optimistic Policy Gradient in Multi-Player Markov Games with a Single Controller: Convergence Beyond the Minty Property | Dec 19, 2023 | Policy Gradient Methods | Unverified | 0 |
| Privacy Preserving Multi-Agent Reinforcement Learning in Supply Chains | Dec 9, 2023 | Multi-agent Reinforcement Learning, Policy Gradient Methods | Unverified | 0 |
| RL Dreams: Policy Gradient Optimization for Score Distillation based 3D Generation | Dec 8, 2023 | 3D Generation, Denoising | Unverified | 0 |
| Score-Aware Policy-Gradient Methods and Performance Guarantees using Local Lyapunov Conditions: Applications to Product-Form Stochastic Networks and Queueing Systems | Dec 5, 2023 | Form, Model-based Reinforcement Learning | Unverified | 0 |
| Predictable Reinforcement Learning Dynamics through Entropy Rate Minimization | Nov 30, 2023 | Policy Gradient Methods, reinforcement-learning | Code Available | 0 |
| A Large Deviations Perspective on Policy Gradient Algorithms | Nov 13, 2023 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |
| Clipped-Objective Policy Gradients for Pessimistic Policy Optimization | Nov 10, 2023 | Deep Reinforcement Learning, Multi-Task Learning | Code Available | 0 |
| On the Second-Order Convergence of Biased Policy Gradient Algorithms | Nov 5, 2023 | Policy Gradient Methods | Unverified | 0 |
| Riemannian stochastic optimization methods avoid strict saddle points | Nov 4, 2023 | Dictionary Learning, Policy Gradient Methods | Unverified | 0 |
| Federated Natural Policy Gradient and Actor Critic Methods for Multi-task Reinforcement Learning | Nov 1, 2023 | Decision Making, Policy Gradient Methods | Unverified | 0 |
| Optimization Landscape of Policy Gradient Methods for Discrete-time Static Output Feedback | Oct 29, 2023 | Policy Gradient Methods | Unverified | 0 |
| Accelerated Policy Gradient: On the Convergence Rates of the Nesterov Momentum for Reinforcement Learning | Oct 18, 2023 | Policy Gradient Methods, reinforcement-learning | Code Available | 0 |
| f-Policy Gradients: A General Framework for Goal Conditioned RL using f-Divergences | Oct 10, 2023 | Efficient Exploration, Policy Gradient Methods | Unverified | 0 |
| Global Convergence of Policy Gradient Methods in Reinforcement Learning, Games and Control | Oct 8, 2023 | Decision Making, Policy Gradient Methods | Unverified | 0 |
| Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods | Oct 8, 2023 | Policy Gradient Methods, Traveling Salesman Problem | Unverified | 0 |
| Beyond Stationarity: Convergence Analysis of Stochastic Softmax Policy Gradient Methods | Oct 4, 2023 | Decision Making, Policy Gradient Methods | Unverified | 0 |
| Sample Complexity of Neural Policy Mirror Descent for Policy Optimization on Low-Dimensional Manifolds | Sep 25, 2023 | Policy Gradient Methods, Reinforcement Learning (RL) | Unverified | 0 |
| Oracle Complexity Reduction for Model-free LQR: A Stochastic Variance-Reduced Policy Gradient Approach | Sep 19, 2023 | Policy Gradient Methods | Code Available | 0 |
| Learning Zero-Sum Linear Quadratic Games with Improved Sample Complexity and Last-Iterate Convergence | Sep 8, 2023 | Multi-agent Reinforcement Learning, Policy Gradient Methods | Code Available | 0 |
| Commodities Trading through Deep Policy Gradient Methods | Aug 10, 2023 | Algorithmic Trading, Deep Reinforcement Learning | Unverified | 0 |
| Hindsight-DICE: Stable Credit Assignment for Deep Reinforcement Learning | Jul 21, 2023 | Decision Making, Deep Reinforcement Learning | Code Available | 0 |
| Enabling Efficient, Reliable Real-World Reinforcement Learning with Approximate Physics-Based Models | Jul 16, 2023 | Policy Gradient Methods | Code Available | 0 |