| Augmented Bayesian Policy Search | Jul 5, 2024 | Bayesian Optimization, LEMMA | —Unverified | 0 | 0 |
| Asynchronous stochastic approximations with asymptotically biased errors and deep multi-agent learning | Feb 22, 2018 | Multi-agent Reinforcement Learning, Policy Gradient Methods | —Unverified | 0 | 0 |
| A K-fold Method for Baseline Estimation in Policy Gradient Algorithms | Jan 3, 2017 | MuJoCo, Policy Gradient Methods | —Unverified | 0 | 0 |
| Learning Dynamics and Generalization in Reinforcement Learning | Jun 5, 2022 | Policy Gradient Methods, reinforcement-learning | —Unverified | 0 | 0 |
| Difference Rewards Policy Gradients | Dec 21, 2020 | counterfactual, Multi-agent Reinforcement Learning | —Unverified | 0 | 0 |
| Asynchronous Multi-Agent Actor-Critic with Macro-Actions | Sep 29, 2021 | Decision Making, Policy Gradient Methods | —Unverified | 0 | 0 |
| Is the Policy Gradient a Gradient? | Jun 17, 2019 | Open-Ended Question Answering, Policy Gradient Methods | —Unverified | 0 | 0 |
| Deterministic Policy Gradient Primal-Dual Methods for Continuous-Space Constrained MDPs | Aug 19, 2024 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| Asynchronous Actor-Critic for Multi-Agent Reinforcement Learning | Sep 20, 2022 | Decision Making, Multi-agent Reinforcement Learning | —Unverified | 0 | 0 |
| Information-Theoretic Opacity-Enforcement in Markov Decision Processes | Apr 30, 2024 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Deep Reinforcement Learning based Blind mmWave MIMO Beam Alignment | Jan 25, 2020 | Deep Reinforcement Learning, Policy Gradient Methods | —Unverified | 0 | 0 |
| Information Maximizing Exploration with a Latent Dynamics Model | Apr 4, 2018 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| Independent Policy Gradient Methods for Competitive Reinforcement Learning | Jan 11, 2021 | Policy Gradient Methods, reinforcement-learning | —Unverified | 0 | 0 |
| A Study of Policy Gradient on a Class of Exactly Solvable Models | Nov 3, 2020 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence | Feb 8, 2022 | Multi-agent Reinforcement Learning, Policy Gradient Methods | —Unverified | 0 | 0 |
| Intervention-Assisted Policy Gradient Methods for Online Stochastic Queuing Network Optimization: Technical Report | Apr 5, 2024 | Deep Reinforcement Learning, Policy Gradient Methods | —Unverified | 0 | 0 |
| Independent Natural Policy Gradient Methods for Potential Games: Finite-time Global Convergence with Entropy Regularization | Apr 12, 2022 | Autonomous Vehicles, Policy Gradient Methods | —Unverified | 0 | 0 |
| Incremental Policy Gradients for Online Reinforcement Learning Control | Jan 1, 2021 | Policy Gradient Methods, reinforcement-learning | —Unverified | 0 | 0 |
| Improving Sample Efficiency and Multi-Agent Communication in RL-based Train Rescheduling | Apr 28, 2020 | Policy Gradient Methods, reinforcement-learning | —Unverified | 0 | 0 |
| KIPPO: Koopman-Inspired Proximal Policy Optimization | May 20, 2025 | Computational Efficiency, continuous-control | —Unverified | 0 | 0 |
| Landscape of Policy Optimization for Finite Horizon MDPs with General State and Action | Sep 25, 2024 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Learning Decentralized Partially Observable Mean Field Control for Artificial Collective Behavior | Jul 12, 2023 | Policy Gradient Methods, Reinforcement Learning (RL) | —Unverified | 0 | 0 |
| Deep Policy Gradient Methods in Commodity Markets | Jun 14, 2023 | Deep Reinforcement Learning, Policy Gradient Methods | —Unverified | 0 | 0 |
| Learning from Algorithm Feedback: One-Shot SAT Solver Guidance with GNNs | May 21, 2025 | Combinatorial Optimization, Policy Gradient Methods | —Unverified | 0 | 0 |
| Assumption Questioning: Latent Copying and Reward Exploitation in Question Generation | Sep 27, 2018 | Inductive Bias, Machine Translation | —Unverified | 0 | 0 |