SOTAVerified

Policy Gradient Methods

Papers

Showing 326–350 of 382 papers

Title | Status | Hype
Remember and Forget for Experience Replay | Code | 0
Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control | Code | 0
Where Did My Optimum Go?: An Empirical Analysis of Gradient Descent Optimization in Policy Gradient Methods | Code | 0
Shapley Q-value: A Local Reward Approach to Solve Global Reward Games | Code | 0
Rethinking Action Spaces for Reinforcement Learning in End-to-end Dialog Agents with Latent Variable Models | Code | 0
The Mirage of Action-Dependent Baselines in Reinforcement Learning | Code | 0
Matrix Low-Rank Approximation For Policy Gradient Methods | Code | 0
Oracle Complexity Reduction for Model-free LQR: A Stochastic Variance-Reduced Policy Gradient Approach | Code | 0
MDPGT: Momentum-based Decentralized Policy Gradient Tracking | Code | 0
Predictable Reinforcement Learning Dynamics through Entropy Rate Minimization | Code | 0
A Nonparametric Off-Policy Policy Gradient | Code | 0
Clipped-Objective Policy Gradients for Pessimistic Policy Optimization | Code | 0
Model-free and Bayesian Ensembling Model-based Deep Reinforcement Learning for Particle Accelerator Control Demonstrated on the FERMI FEL | Code | 0
Deep Reinforcement Learning Algorithm for Dynamic Pricing of Express Lanes with Multiple Access Locations | Code | 0
PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning | Code | 0
Accelerated Policy Gradient: On the Convergence Rates of the Nesterov Momentum for Reinforcement Learning | Code | 0
Momentum-Based Policy Gradient Methods | Code | 0
Health-Informed Policy Gradients for Multi-Agent Reinforcement Learning | Code | 0
Hierarchical Policy-Gradient Reinforcement Learning for Multi-Agent Shepherding Control of Non-Cohesive Targets | Code | 0
High-Dimensional Continuous Control Using Generalized Advantage Estimation | Code | 0
Hindsight-DICE: Stable Credit Assignment for Deep Reinforcement Learning | Code | 0
Hindsight policy gradients | Code | 0
Hindsight Trust Region Policy Optimization | Code | 0
Hindsight Value Function for Variance Reduction in Stochastic Dynamic Environment | Code | 0
A general class of surrogate functions for stable and efficient reinforcement learning | Code | 0

No leaderboard results yet.