| Almost sure convergence rates of stochastic gradient methods under gradient domination | May 22, 2024 | Policy Gradient Methodsreinforcement-learning | —Unverified | 0 | 0 |
| Analysis and Improvement of Policy Gradient Estimation | Dec 1, 2011 | Policy Gradient Methodsreinforcement-learning | —Unverified | 0 | 0 |
| Analysis of On-policy Policy Gradient Methods under the Distribution Mismatch | Mar 28, 2025 | Policy Gradient Methods | —Unverified | 0 | 0 |
| An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods | Nov 15, 2022 | Policy Gradient Methods | —Unverified | 0 | 0 |
| An Off-policy Policy Gradient Theorem Using Emphatic Weightings | Nov 22, 2018 | Policy Gradient MethodsReinforcement Learning | —Unverified | 0 | 0 |
| An operator view of policy gradient methods | Jun 19, 2020 | Policy Gradient Methods | —Unverified | 0 | 0 |
| On Linear Convergence of Policy Gradient Methods for Finite MDPs | Jul 21, 2020 | Policy Gradient Methods | —Unverified | 0 | 0 |
| An Initial Introduction to Cooperative Multi-Agent Reinforcement Learning | May 10, 2024 | MisconceptionsMulti-agent Reinforcement Learning | —Unverified | 0 | 0 |
| A Policy Gradient Framework for Stochastic Optimal Control Problems with Global Convergence Guarantee | Feb 11, 2023 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Approximation Benefits of Policy Gradient Methods with Aggregated States | Jul 22, 2020 | Policy Gradient Methods | —Unverified | 0 | 0 |
| A reinterpretation of the policy oscillation phenomenon in approximate policy iteration | Dec 1, 2011 | Policy Gradient MethodsReinforcement Learning | —Unverified | 0 | 0 |
| A Self-Supervised Reinforcement Learning Approach for Fine-Tuning Large Language Models Using Cross-Attention Signals | Feb 14, 2025 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Assumption Questioning: Latent Copying and Reward Exploitation in Question Generation | Sep 27, 2018 | Inductive BiasMachine Translation | —Unverified | 0 | 0 |
| A Study of Policy Gradient on a Class of Exactly Solvable Models | Nov 3, 2020 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Asynchronous Actor-Critic for Multi-Agent Reinforcement Learning | Sep 20, 2022 | Decision MakingMulti-agent Reinforcement Learning | —Unverified | 0 | 0 |
| Asynchronous Multi-Agent Actor-Critic with Macro-Actions | Sep 29, 2021 | Decision MakingPolicy Gradient Methods | —Unverified | 0 | 0 |
| Asynchronous stochastic approximations with asymptotically biased errors and deep multi-agent learning | Feb 22, 2018 | Multi-agent Reinforcement LearningPolicy Gradient Methods | —Unverified | 0 | 0 |
| Augmented Bayesian Policy Search | Jul 5, 2024 | Bayesian OptimizationLEMMA | —Unverified | 0 | 0 |
| AUGMENTED POLICY GRADIENT METHODS FOR EFFICIENT REINFORCEMENT LEARNING | Sep 25, 2019 | Policy Gradient Methodsreinforcement-learning | —Unverified | 0 | 0 |
| A unified view of entropy-regularized Markov decision processes | May 22, 2017 | Policy Gradient Methodsreinforcement-learning | —Unverified | 0 | 0 |
| Batch Policy Gradient Methods for Improving Neural Conversation Models | Feb 10, 2017 | ChatbotPolicy Gradient Methods | —Unverified | 0 | 0 |
| Batch Reinforcement Learning with a Nonparametric Off-Policy Policy Gradient | Oct 27, 2020 | Policy Gradient Methodsreinforcement-learning | —Unverified | 0 | 0 |
| Bayesian Residual Policy Optimization: Scalable Bayesian Reinforcement Learning with Clairvoyant Experts | Feb 7, 2020 | Decision MakingPolicy Gradient Methods | —Unverified | 0 | 0 |
| Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization | Oct 19, 2021 | Policy Gradient MethodsReinforcement Learning (RL) | —Unverified | 0 | 0 |
| Beyond Stationarity: Convergence Analysis of Stochastic Softmax Policy Gradient Methods | Oct 4, 2023 | Decision MakingPolicy Gradient Methods | —Unverified | 0 | 0 |
| BOTS: Batch Bayesian Optimization of Extended Thompson Sampling for Severely Episode-Limited RL Settings | Nov 30, 2024 | Bayesian OptimizationPolicy Gradient Methods | —Unverified | 0 | 0 |
| CaLcs: Continuously Approximating Longest Common Subsequence for Sequence Level Optimization | Oct 1, 2018 | Abstractive Text SummarizationImage Captioning | —Unverified | 0 | 0 |
| Factored Policy Gradients: Leveraging Structure for Efficient Learning in MOMDPs | Feb 20, 2021 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Commodities Trading through Deep Policy Gradient Methods | Aug 10, 2023 | Algorithmic TradingDeep Reinforcement Learning | —Unverified | 0 | 0 |
| Communication-Efficient Policy Gradient Methods for Distributed Reinforcement Learning | Dec 7, 2018 | Distributed ComputingMulti-agent Reinforcement Learning | —Unverified | 0 | 0 |
| Computing and Learning Stationary Mean Field Equilibria with Scalar Interactions: Algorithms and Applications | Feb 2, 2025 | counterfactualPolicy Gradient Methods | —Unverified | 0 | 0 |
| Controlling an Inverted Pendulum with Policy Gradient Methods-A Tutorial | May 17, 2021 | OpenAI GymPolicy Gradient Methods | —Unverified | 0 | 0 |
| Control randomisation approach for policy gradient and application to reinforcement learning in optimal switching | Apr 27, 2024 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings | Oct 30, 2021 | Policy Gradient Methodsreinforcement-learning | —Unverified | 0 | 0 |
| Convergence and Price of Anarchy Guarantees of the Softmax Policy Gradient in Markov Potential Games | Jun 15, 2022 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Convergence of policy gradient methods for finite-horizon exploratory linear-quadratic control problems | Nov 1, 2022 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Correcting discount-factor mismatch in on-policy policy gradient methods | Jun 23, 2023 | OpenAI GymPolicy Gradient Methods | —Unverified | 0 | 0 |
| Countering Language Drift via Grounding | Sep 27, 2018 | Language ModelingLanguage Modelling | —Unverified | 0 | 0 |
| Curious Explorer: a provable exploration strategy in Policy Learning | Jun 29, 2021 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Current applications and potential future directions of reinforcement learning-based Digital Twins in agriculture | Jun 13, 2024 | Decision MakingManagement | —Unverified | 0 | 0 |
| DeepGait: Planning and Control of Quadrupedal Gaits using Deep Reinforcement Learning | Sep 18, 2019 | Deep Reinforcement LearningMotion Planning | —Unverified | 0 | 0 |
| Deep Policy Gradient Methods in Commodity Markets | Jun 14, 2023 | Deep Reinforcement LearningPolicy Gradient Methods | —Unverified | 0 | 0 |
| Deep Reinforcement Learning based Blind mmWave MIMO Beam Alignment | Jan 25, 2020 | Deep Reinforcement LearningPolicy Gradient Methods | —Unverified | 0 | 0 |
| Deterministic Policy Gradient Primal-Dual Methods for Continuous-Space Constrained MDPs | Aug 19, 2024 | continuous-controlContinuous Control | —Unverified | 0 | 0 |
| Difference Rewards Policy Gradients | Dec 21, 2020 | counterfactualMulti-agent Reinforcement Learning | —Unverified | 0 | 0 |
| Diverse Exploration via Conjugate Policies for Policy Gradient Methods | Feb 10, 2019 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Efficient Baseline-free Sampling in Parameter Exploring Policy Gradients: Super Symmetric PGPE | Dec 13, 2013 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Reinforcement Learning for Causal Discovery without Acyclicity Constraints | Aug 24, 2024 | Causal DiscoveryEfficient Exploration | —Unverified | 0 | 0 |
| Efficient Wasserstein and Sinkhorn Policy Optimization | Sep 29, 2021 | Policy Gradient MethodsReinforcement Learning (RL) | —Unverified | 0 | 0 |
| Elementary Analysis of Policy Gradient Methods | Apr 4, 2024 | Policy Gradient Methods | —Unverified | 0 | 0 |