| PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning | Jul 16, 2020 | Policy Gradient Methods, Q-Learning | Code Available | 0 | 5 |
| Shapley Q-value: A Local Reward Approach to Solve Global Reward Games | Jul 11, 2019 | Multi-agent Reinforcement Learning, Policy Gradient Methods | Code Available | 0 | 5 |
| Hindsight Trust Region Policy Optimization | Jul 29, 2019 | Atari Games, Policy Gradient Methods | Code Available | 0 | 5 |
| Commodities Trading through Deep Policy Gradient Methods | Aug 10, 2023 | Algorithmic Trading, Deep Reinforcement Learning | Unverified | 0 | 0 |
| Fine-Grained AutoAugmentation for Multi-Label Classification | Jul 12, 2021 | Classification, Data Augmentation | Unverified | 0 | 0 |
| An Off-policy Policy Gradient Theorem Using Emphatic Weightings | Nov 22, 2018 | Policy Gradient Methods, Reinforcement Learning | Unverified | 0 | 0 |
| Fill-and-Spill: Deep Reinforcement Learning Policy Gradient Methods for Reservoir Operation Decision and Control | Mar 7, 2024 | Deep Reinforcement Learning, Policy Gradient Methods | Unverified | 0 | 0 |
| Federated Natural Policy Gradient and Actor Critic Methods for Multi-task Reinforcement Learning | Nov 1, 2023 | Decision Making, Policy Gradient Methods | Unverified | 0 | 0 |
| An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods | Nov 15, 2022 | Policy Gradient Methods | Unverified | 0 | 0 |
| Momentum-Based Policy Gradient with Second-Order Information | May 17, 2022 | Policy Gradient Methods | Unverified | 0 | 0 |
| Fast Global Convergence of Natural Policy Gradient Methods with Entropy Regularization | Jul 13, 2020 | Policy Gradient Methods | Unverified | 0 | 0 |
| Factored Policy Gradients: Leveraging Structure for Efficient Learning in MOMDPs | Feb 20, 2021 | Policy Gradient Methods | Unverified | 0 | 0 |
| Expected Policy Gradients for Reinforcement Learning | Jan 10, 2018 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 | 0 |
| Exchangeable Input Representations for Reinforcement Learning | Mar 19, 2020 | Deep Reinforcement Learning, Policy Gradient Methods | Unverified | 0 | 0 |
| Evolution Strategies as an Alternate Learning method for Hierarchical Reinforcement Learning | Sep 29, 2021 | Hierarchical Reinforcement Learning, Policy Gradient Methods | Unverified | 0 | 0 |
| CaLcs: Continuously Approximating Longest Common Subsequence for Sequence Level Optimization | Oct 1, 2018 | Abstractive Text Summarization, Image Captioning | Unverified | 0 | 0 |
| BOTS: Batch Bayesian Optimization of Extended Thompson Sampling for Severely Episode-Limited RL Settings | Nov 30, 2024 | Bayesian Optimization, Policy Gradient Methods | Unverified | 0 | 0 |
| Adaptive Batch Size for Safe Policy Gradients | Dec 1, 2017 | Policy Gradient Methods, Reinforcement Learning | Unverified | 0 | 0 |
| Evolutionary Selective Imitation: Interpretable Agents by Imitation Learning Without a Demonstrator | Sep 17, 2020 | Imitation Learning, OpenAI Gym | Unverified | 0 | 0 |
| Federated Reinforcement Learning with Constraint Heterogeneity | May 6, 2024 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Evolutionary Policy Optimization | Apr 17, 2025 | Policy Gradient Methods, Reinforcement Learning (RL) | Unverified | 0 | 0 |
| Beyond Stationarity: Convergence Analysis of Stochastic Softmax Policy Gradient Methods | Oct 4, 2023 | Decision Making, Policy Gradient Methods | Unverified | 0 | 0 |
| Optimal Rates of Convergence for Entropy Regularization in Discounted Markov Decision Processes | Jun 6, 2024 | Policy Gradient Methods | Unverified | 0 | 0 |
| Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization | Oct 19, 2021 | Policy Gradient Methods, Reinforcement Learning (RL) | Unverified | 0 | 0 |
| Fingerprint Policy Optimisation for Robust Reinforcement Learning | May 27, 2018 | Bayesian Optimisation, Continuous Control | Unverified | 0 | 0 |
| Focused Hierarchical RNNs for Conditional Sequence Processing | Jun 12, 2018 | Open-Domain Question Answering, Policy Gradient Methods | Unverified | 0 | 0 |
| Analysis of On-policy Policy Gradient Methods under the Distribution Mismatch | Mar 28, 2025 | Policy Gradient Methods | Unverified | 0 | 0 |
| Equivalence of stochastic and deterministic policy gradients | May 29, 2025 | continuous-control, Continuous Control | Unverified | 0 | 0 |
| Equivalence Between Policy Gradients and Soft Q-Learning | Apr 21, 2017 | Policy Gradient Methods, Q-Learning | Unverified | 0 | 0 |
| Bayesian Residual Policy Optimization: Scalable Bayesian Reinforcement Learning with Clairvoyant Experts | Feb 7, 2020 | Decision Making, Policy Gradient Methods | Unverified | 0 | 0 |
| Entropy Regularization with Discounted Future State Distribution in Policy Gradient Methods | Dec 11, 2019 | Policy Gradient Methods | Unverified | 0 | 0 |
| Analysis and Improvement of Policy Gradient Estimation | Dec 1, 2011 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 | 0 |
| Confidence-Controlled Exploration: Efficient Sparse-Reward Policy Learning for Robot Navigation | Jun 9, 2023 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 | 0 |
| Entropy annealing for policy mirror descent in continuous time and space | May 30, 2024 | Policy Gradient Methods | Unverified | 0 | 0 |
| Entropic Risk Measure in Policy Search | Jun 21, 2019 | Policy Gradient Methods | Unverified | 0 | 0 |
| Enhanced DACER Algorithm with High Diffusion Efficiency | May 29, 2025 | Denoising, Imitation Learning | Unverified | 0 | 0 |
| End-to-End Neuro-Symbolic Architecture for Image-to-Image Reasoning Tasks | Jun 6, 2021 | Image Reconstruction, Policy Gradient Methods | Unverified | 0 | 0 |
| Batch Reinforcement Learning with a Nonparametric Off-Policy Policy Gradient | Oct 27, 2020 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 | 0 |
| Almost sure convergence rates of stochastic gradient methods under gradient domination | May 22, 2024 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 | 0 |
| Elementary Analysis of Policy Gradient Methods | Apr 4, 2024 | Policy Gradient Methods | Unverified | 0 | 0 |
| Batch Policy Gradient Methods for Improving Neural Conversation Models | Feb 10, 2017 | Chatbot, Policy Gradient Methods | Unverified | 0 | 0 |
| Efficient Wasserstein and Sinkhorn Policy Optimization | Sep 29, 2021 | Policy Gradient Methods, Reinforcement Learning (RL) | Unverified | 0 | 0 |
| Reinforcement Learning for Causal Discovery without Acyclicity Constraints | Aug 24, 2024 | Causal Discovery, Efficient Exploration | Unverified | 0 | 0 |
| All-Action Policy Gradient Methods: A Numerical Integration Approach | Oct 21, 2019 | All, continuous-control | Unverified | 0 | 0 |
| AdaFrame: Adaptive Frame Selection for Fast Video Recognition | Nov 29, 2018 | Policy Gradient Methods, Video Recognition | Unverified | 0 | 0 |
| Accelerating Policy Gradient by Estimating Value Function from Prior Computation in Deep Reinforcement Learning | Feb 2, 2023 | Deep Reinforcement Learning, Policy Gradient Methods | Unverified | 0 | 0 |
| 2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition | Dec 29, 2020 | Action Recognition, Policy Gradient Methods | Unverified | 0 | 0 |
| Efficient Baseline-free Sampling in Parameter Exploring Policy Gradients: Super Symmetric PGPE | Dec 13, 2013 | Policy Gradient Methods | Unverified | 0 | 0 |
| A unified view of entropy-regularized Markov decision processes | May 22, 2017 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 | 0 |
| AUGMENTED POLICY GRADIENT METHODS FOR EFFICIENT REINFORCEMENT LEARNING | Sep 25, 2019 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 | 0 |