| Learning Decentralized Partially Observable Mean Field Control for Artificial Collective Behavior | Jul 12, 2023 | Policy Gradient Methods, Reinforcement Learning (RL) | Unverified | 0 |
| Provably Convergent Policy Optimization via Metric-aware Trust Region Methods | Jun 25, 2023 | continuous-control, Continuous Control | Unverified | 0 |
| Correcting discount-factor mismatch in on-policy policy gradient methods | Jun 23, 2023 | OpenAI Gym, Policy Gradient Methods | Unverified | 0 |
| Neural Inventory Control in Networks via Hindsight Differentiable Policy Optimization | Jun 20, 2023 | Deep Reinforcement Learning, Management | Code Available | 1 |
| Acceleration in Policy Optimization | Jun 18, 2023 | Meta-Learning, Policy Gradient Methods | Unverified | 0 |
| Deep Policy Gradient Methods in Commodity Markets | Jun 14, 2023 | Deep Reinforcement Learning, Policy Gradient Methods | Unverified | 0 |
| Stepsize Learning for Policy Gradient Methods in Contextual Markov Decision Processes | Jun 13, 2023 | Meta Reinforcement Learning, Policy Gradient Methods | Unverified | 0 |
| Confidence-Controlled Exploration: Efficient Sparse-Reward Policy Learning for Robot Navigation | Jun 9, 2023 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |
| Efficient Diffusion Policies for Offline Reinforcement Learning | May 31, 2023 | D4RL, Offline RL | Code Available | 1 |
| Solving Robust MDPs through No-Regret Dynamics | May 30, 2023 | Navigate, Policy Gradient Methods | Unverified | 0 |
| Adaptive Policy Learning to Additional Tasks | May 24, 2023 | Policy Gradient Methods | Unverified | 0 |
| Shattering the Agent-Environment Interface for Fine-Tuning Inclusive Language Models | May 19, 2023 | Efficient Exploration, Language Modeling | Unverified | 0 |
| Client Selection for Federated Policy Optimization with Environment Heterogeneity | May 18, 2023 | MuJoCo, Policy Gradient Methods | Code Available | 0 |
| Policy Gradient Methods in the Presence of Symmetries and State Abstractions | May 9, 2023 | continuous-control, Continuous Control | Code Available | 1 |
| Online Portfolio Management via Deep Reinforcement Learning with High-Frequency Data | May 1, 2023 | Deep Reinforcement Learning, Management | Code Available | 1 |
| Policy Gradient Methods for Discrete Time Linear Quadratic Regulator With Random Parameters | Mar 29, 2023 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |
| Policy Mirror Descent Inherently Explores Action Space | Mar 8, 2023 | Efficient Exploration, General Reinforcement Learning | Unverified | 0 |
| Policy gradient learning methods for stochastic control with exit time and applications to share repurchase pricing | Feb 14, 2023 | Policy Gradient Methods | Unverified | 0 |
| A Policy Gradient Framework for Stochastic Optimal Control Problems with Global Convergence Guarantee | Feb 11, 2023 | Policy Gradient Methods | Unverified | 0 |
| Distributional constrained reinforcement learning for supply chain optimization | Feb 3, 2023 | Distributional Reinforcement Learning, Policy Gradient Methods | Code Available | 0 |
| Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies | Feb 3, 2023 | Policy Gradient Methods | Unverified | 0 |
| Accelerating Policy Gradient by Estimating Value Function from Prior Computation in Deep Reinforcement Learning | Feb 2, 2023 | Deep Reinforcement Learning, Policy Gradient Methods | Unverified | 0 |
| Policy Gradient for Rectangular Robust Markov Decision Processes | Jan 31, 2023 | Form, Policy Gradient Methods | Unverified | 0 |
| SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree Search | Jan 30, 2023 | GPU, Policy Gradient Methods | Unverified | 0 |
| Stochastic Dimension-reduced Second-order Methods for Policy Optimization | Jan 28, 2023 | Policy Gradient Methods, Second-order methods | Unverified | 0 |
| On the Global Convergence of Risk-Averse Policy Gradient Methods with Expected Conditional Risk Measures | Jan 26, 2023 | Decision Making, Policy Gradient Methods | Unverified | 0 |
| Partial advantage estimator for proximal policy optimization | Jan 26, 2023 | MuJoCo, Policy Gradient Methods | Code Available | 1 |
| Improving a sequence-to-sequence nlp model using a reinforcement learning policy algorithm | Dec 28, 2022 | Chatbot, Deep Reinforcement Learning | Unverified | 0 |
| On the Convergence of Discounted Policy Gradient Methods | Dec 28, 2022 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |
| Policy Gradient in Robust MDPs with Global Convergence Guarantee | Dec 20, 2022 | Policy Gradient Methods | Code Available | 0 |
| An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods | Nov 15, 2022 | Policy Gradient Methods | Unverified | 0 |
| Geometry and convergence of natural policy gradient methods | Nov 3, 2022 | Policy Gradient Methods | Unverified | 0 |
| Convergence of policy gradient methods for finite-horizon exploratory linear-quadratic control problems | Nov 1, 2022 | Policy Gradient Methods | Unverified | 0 |
| Symmetric (Optimistic) Natural Policy Gradient for Multi-agent Learning with Parameter Convergence | Oct 23, 2022 | Policy Gradient Methods | Unverified | 0 |
| Policy Gradient Methods for Designing Dynamic Output Feedback Controllers | Oct 18, 2022 | Policy Gradient Methods | Unverified | 0 |
| On the convergence of policy gradient methods to Nash equilibria in general stochastic games | Oct 17, 2022 | Policy Gradient Methods | Unverified | 0 |
| Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies | Oct 4, 2022 | Policy Gradient Methods | Unverified | 0 |
| Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization | Oct 3, 2022 | Decision Making, Policy Gradient Methods | Code Available | 1 |
| SoftTreeMax: Policy Gradient with Tree Search | Sep 28, 2022 | Policy Gradient Methods | Unverified | 0 |
| Asynchronous Actor-Critic for Multi-Agent Reinforcement Learning | Sep 20, 2022 | Decision Making, Multi-agent Reinforcement Learning | Unverified | 0 |
| Continuous MDP Homomorphisms and Homomorphic Policy Gradient | Sep 15, 2022 | continuous-control, Continuous Control | Code Available | 1 |
| On the Optimization Landscape of Dynamic Output Feedback: A Case Study for Linear Quadratic Regulator | Sep 12, 2022 | Decision Making, Policy Gradient Methods | Unverified | 0 |
| The Performance Impact of Combining Agent Factorization with Different Learning Algorithms for Multiagent Coordination | Sep 9, 2022 | Management, Policy Gradient Methods | Code Available | 0 |
| Natural Policy Gradients In Reinforcement Learning Explained | Sep 5, 2022 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |
| Towards Global Optimality in Cooperative MARL with the Transformation And Distillation Framework | Jul 12, 2022 | Multi-agent Reinforcement Learning, Policy Gradient Methods | Unverified | 0 |
| Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning | Jul 12, 2022 | Lifelong learning, Policy Gradient Methods | Code Available | 1 |
| Convergence and Price of Anarchy Guarantees of the Softmax Policy Gradient in Markov Potential Games | Jun 15, 2022 | Policy Gradient Methods | Unverified | 0 |
| Variance Reduction for Policy-Gradient Methods via Empirical Variance Minimization | Jun 14, 2022 | Policy Gradient Methods, Reinforcement Learning (RL) | Unverified | 0 |
| How are policy gradient methods affected by the limits of control? | Jun 14, 2022 | Policy Gradient Methods | Unverified | 0 |
| Learning Dynamics and Generalization in Reinforcement Learning | Jun 5, 2022 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |