| Learning Decentralized Partially Observable Mean Field Control for Artificial Collective Behavior | Jul 12, 2023 | Policy Gradient Methods, Reinforcement Learning (RL) | Unverified | 0 |
| Learning Dynamics and Generalization in Reinforcement Learning | Jun 5, 2022 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |
| Learning from Algorithm Feedback: One-Shot SAT Solver Guidance with GNNs | May 21, 2025 | Combinatorial Optimization, Policy Gradient Methods | Unverified | 0 |
| Learning in complex action spaces without policy gradients | Oct 8, 2024 | Policy Gradient Methods, Q-Learning | Unverified | 0 |
| Learning Novel Policies For Tasks | May 13, 2019 | Policy Gradient Methods, Reinforcement Learning | Unverified | 0 |
| Learning Self-Imitating Diverse Policies | May 25, 2018 | continuous-control, Continuous Control | Unverified | 0 |
| Learning to Interrupt: A Hierarchical Deep Reinforcement Learning Framework for Efficient Exploration | Jul 30, 2018 | Deep Reinforcement Learning, Efficient Exploration | Unverified | 0 |
| Lifelong Learning of Factored Policies via Policy Gradients | Jun 12, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| Policy Gradient Methods for Distortion Risk Measures | Jul 9, 2021 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |
| Linear convergence of a policy gradient method for some finite horizon continuous time control problems | Mar 22, 2022 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |
| Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies | Oct 4, 2022 | Policy Gradient Methods | Unverified | 0 |
| Linear Function Approximation as a Computationally Efficient Method to Solve Classical Reinforcement Learning Challenges | May 27, 2024 | Acrobot, Policy Gradient Methods | Unverified | 0 |
| Linear-Quadratic Mean-Field Reinforcement Learning: Convergence of Policy Gradient Methods | Oct 9, 2019 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |
| Local Advantage Actor-Critic for Robust Multi-Agent Deep Reinforcement Learning | Oct 16, 2021 | Deep Reinforcement Learning, Multi-agent Reinforcement Learning | Unverified | 0 |
| Local Pairwise Distance Matching for Backpropagation-Free Reinforcement Learning | Jul 15, 2025 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |
| Manifold Regularization for Kernelized LSTD | Oct 15, 2017 | Policy Gradient Methods, Reinforcement Learning | Unverified | 0 |
| Optimal Control-Based Baseline for Guided Exploration in Policy Gradient Methods | Nov 4, 2020 | Deep Reinforcement Learning, Policy Gradient Methods | Unverified | 0 |
| Learning to Constrain Policy Optimization with Virtual Trust Region | Apr 20, 2022 | Atari Games, Policy Gradient Methods | Unverified | 0 |
| Meta Learning the Step Size in Policy Gradient Methods | May 20, 2021 | Meta-Learning, Meta Reinforcement Learning | Unverified | 0 |
| Metastable Dynamics of Chain-of-Thought Reasoning: Provable Benefits of Search, RL and Distillation | Feb 2, 2025 | Policy Gradient Methods | Unverified | 0 |
| Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment | Jun 28, 2021 | Decision Making, Policy Gradient Methods | Unverified | 0 |
| Mollification Effects of Policy Gradient Methods | May 28, 2024 | continuous-control, Continuous Control | Unverified | 0 |
| Asynchronous, Option-Based Multi-Agent Policy Gradient: A Conditional Reasoning Approach | Mar 29, 2022 | Hierarchical Reinforcement Learning, Multi-agent Reinforcement Learning | Unverified | 0 |
| Multiagent Soft Q-Learning | Apr 25, 2018 | Policy Gradient Methods, Q-Learning | Unverified | 0 |
| Multi Pseudo Q-learning Based Deterministic Policy Gradient for Tracking Control of Autonomous Underwater Vehicles | Sep 7, 2019 | Policy Gradient Methods, Q-Learning | Unverified | 0 |