| Statistically Efficient Off-Policy Policy Gradients | Feb 10, 2020 | Policy Gradient Methods, Reinforcement Learning | —Unverified | 0 |
| Stein Variational Policy Gradient | Apr 7, 2017 | Bayesian Inference, continuous-control | —Unverified | 0 |
| Stepsize Learning for Policy Gradient Methods in Contextual Markov Decision Processes | Jun 13, 2023 | Meta Reinforcement Learning, Policy Gradient Methods | —Unverified | 0 |
| Stochastic Dimension-reduced Second-order Methods for Policy Optimization | Jan 28, 2023 | Policy Gradient Methods, Second-order methods | —Unverified | 0 |
| Stochastic first-order methods for average-reward Markov decision processes | May 11, 2022 | Policy Gradient Methods | —Unverified | 0 |
| Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies | Feb 3, 2023 | Policy Gradient Methods | —Unverified | 0 |
| Stochastic Recursive Momentum for Policy Gradient Methods | Mar 9, 2020 | Policy Gradient Methods | —Unverified | 0 |
| Stochastic Second-Order Methods Improve Best-Known Sample Complexity of SGD for Gradient-Dominated Function | May 25, 2022 | Policy Gradient Methods, Reinforcement Learning (RL) | —Unverified | 0 |
| Stochastic Variance Reduction for Policy Gradient Estimation | Oct 17, 2017 | continuous-control, Continuous Control | —Unverified | 0 |
| Strategic bidding in freight transport using deep reinforcement learning | Feb 18, 2021 | Deep Reinforcement Learning, Fairness | —Unverified | 0 |
| Strongly-polynomial time and validation analysis of policy gradient methods | Sep 28, 2024 | Policy Gradient Methods, Reinforcement Learning (RL) | —Unverified | 0 |
| Symmetric (Optimistic) Natural Policy Gradient for Multi-agent Learning with Parameter Convergence | Oct 23, 2022 | Policy Gradient Methods | —Unverified | 0 |
| Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning | May 31, 2021 | Learning Theory, Multi-agent Reinforcement Learning | —Unverified | 0 |
| Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods | Sep 13, 2021 | Policy Gradient Methods, reinforcement-learning | —Unverified | 0 |
| The wisdom of the crowd: reliable deep reinforcement learning through ensembles of Q-functions | Sep 27, 2018 | Deep Reinforcement Learning, Policy Gradient Methods | —Unverified | 0 |
| Token-Efficient RL for LLM Reasoning | Apr 29, 2025 | Policy Gradient Methods, Reinforcement Learning (RL) | —Unverified | 0 |
| Towards Adapting Reinforcement Learning Agents to New Tasks: Insights from Q-Values | Jul 14, 2024 | Policy Gradient Methods, reinforcement-learning | —Unverified | 0 |
| Towards Efficient Risk-Sensitive Policy Gradient: An Iteration Complexity Analysis | Mar 13, 2024 | Policy Gradient Methods, Reinforcement Learning (RL) | —Unverified | 0 |
| Towards Global Optimality in Cooperative MARL with the Transformation And Distillation Framework | Jul 12, 2022 | Multi-agent Reinforcement Learning, Policy Gradient Methods | —Unverified | 0 |
| Towards Provable Log Density Policy Gradient | Mar 3, 2024 | Policy Gradient Methods, reinforcement-learning | —Unverified | 0 |
| Training Diffusion Models Towards Diverse Image Generation with Reinforcement Learning | Jan 1, 2024 | Decision Making, Diversity | —Unverified | 0 |
| Trajectory-wise Control Variates for Variance Reduction in Policy Gradient Methods | Aug 8, 2019 | Policy Gradient Methods, Reinforcement Learning | —Unverified | 0 |
| Transfer Reward Learning for Policy Gradient-Based Text Generation | Sep 9, 2019 | Conditional Text Generation, Image Captioning | —Unverified | 0 |
| Truncating Trajectories in Monte Carlo Policy Evaluation: an Adaptive Approach | Oct 17, 2024 | Policy Gradient Methods, Reinforcement Learning (RL) | —Unverified | 0 |
| Policy Gradient in Partially Observable Environments: Approximation and Convergence | Oct 18, 2018 | Decision Making, Policy Gradient Methods | —Unverified | 0 |
| Understanding Early Word Learning in Situated Artificial Agents | Oct 26, 2017 | Grounded language learning, Policy Gradient Methods | —Unverified | 0 |
| Understanding Grounded Language Learning Agents | Jan 1, 2018 | Grounded language learning, Policy Gradient Methods | —Unverified | 0 |
| Value-Based Reinforcement Learning for Continuous Control Robotic Manipulation in Multi-Task Sparse Reward Settings | Jul 28, 2021 | continuous-control, Continuous Control | —Unverified | 0 |
| Variance Reduced Domain Randomization for Policy Gradient | Sep 29, 2021 | Deep Reinforcement Learning, Policy Gradient Methods | —Unverified | 0 |
| Variance Reduction for Policy-Gradient Methods via Empirical Variance Minimization | Jun 14, 2022 | Policy Gradient Methods, Reinforcement Learning (RL) | —Unverified | 0 |
| Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines | Mar 20, 2018 | Deep Reinforcement Learning, Policy Gradient Methods | —Unverified | 0 |
| Variance Reduction for Reinforcement Learning in Input-Driven Environments | Jul 6, 2018 | Meta-Learning, MuJoCo | —Unverified | 0 |
| Variance Reduction in Actor Critic Methods (ACM) | Jul 23, 2019 | Policy Gradient Methods | —Unverified | 0 |
| When Do Off-Policy and On-Policy Policy Gradient Methods Align? | Feb 19, 2024 | Policy Gradient Methods | —Unverified | 0 |
| Diversity-Inducing Policy Gradient: Using Maximum Mean Discrepancy to Find a Set of Diverse Policies | May 31, 2019 | Diversity, Policy Gradient Methods | —Unverified | 0 |
| Fast Global Convergence of Natural Policy Gradient Methods with Entropy Regularization | Jul 13, 2020 | Policy Gradient Methods | —Unverified | 0 |
| Federated Natural Policy Gradient and Actor Critic Methods for Multi-task Reinforcement Learning | Nov 1, 2023 | Decision Making, Policy Gradient Methods | —Unverified | 0 |
| Federated Reinforcement Learning with Constraint Heterogeneity | May 6, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| Fill-and-Spill: Deep Reinforcement Learning Policy Gradient Methods for Reservoir Operation Decision and Control | Mar 7, 2024 | Deep Reinforcement Learning, Policy Gradient Methods | —Unverified | 0 |
| Fine-Grained AutoAugmentation for Multi-Label Classification | Jul 12, 2021 | Classification, Data Augmentation | —Unverified | 0 |
| Fingerprint Policy Optimisation for Robust Reinforcement Learning | May 27, 2018 | Bayesian Optimisation, Continuous Control | —Unverified | 0 |
| Focused Hierarchical RNNs for Conditional Sequence Processing | Jun 12, 2018 | Open-Domain Question Answering, Policy Gradient Methods | —Unverified | 0 |
| f-Policy Gradients: A General Framework for Goal Conditioned RL using f-Divergences | Oct 10, 2023 | Efficient Exploration, Policy Gradient Methods | —Unverified | 0 |
| From Imitation to Refinement -- Residual RL for Precise Assembly | Jul 23, 2024 | Chunking, Policy Gradient Methods | —Unverified | 0 |
| GACEM: Generalized Autoregressive Cross Entropy Method for Multi-Modal Black Box Constraint Satisfaction | Feb 17, 2020 | Policy Gradient Methods, Reinforcement Learning | —Unverified | 0 |
| Geometry and convergence of natural policy gradient methods | Nov 3, 2022 | Policy Gradient Methods | —Unverified | 0 |
| Global Convergence Guarantees for Federated Policy Gradient Methods with Adversaries | Mar 15, 2024 | Decision Making, Policy Gradient Methods | —Unverified | 0 |
| Global Convergence of Natural Policy Gradient with Hessian-aided Momentum Variance Reduction | Jan 2, 2024 | MuJoCo, Policy Gradient Methods | —Unverified | 0 |
| Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator | Jan 15, 2018 | continuous-control, Continuous Control | —Unverified | 0 |
| Global Convergence of Policy Gradient Methods for Linearized Control Problems | Jan 1, 2018 | continuous-control, Continuous Control | —Unverified | 0 |