SOTAVerified

Policy Gradient Methods

Papers

Showing 176–200 of 382 papers

Title | Status | Hype
Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization |  | 0
Local Advantage Actor-Critic for Robust Multi-Agent Deep Reinforcement Learning |  | 0
Stabilizing Dynamical Systems via Policy Gradient Methods |  | 0
Transform2Act: Learning a Transform-and-Control Policy for Efficient Agent Design | Code | 1
Actor-Critic Policy Optimization in a Large-Scale Imperfect-Information Game |  | 0
Efficient Wasserstein and Sinkhorn Policy Optimization |  | 0
Evolution Strategies as an Alternate Learning method for Hierarchical Reinforcement Learning |  | 0
Sample-efficient actor-critic algorithms with an etiquette for zero-sum Markov games |  | 0
Asynchronous Multi-Agent Actor-Critic with Macro-Actions |  | 0
Variance Reduced Domain Randomization for Policy Gradient |  | 0
Programmatic Reinforcement Learning without Oracles |  | 0
Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods |  | 0
Learning Opinion Summarizers by Selecting Informative Reviews | Code | 1
A general class of surrogate functions for stable and efficient reinforcement learning | Code | 0
Value-Based Reinforcement Learning for Continuous Control Robotic Manipulation in Multi-Task Sparse Reward Settings |  | 0
Policy Gradient Methods Find the Nash Equilibrium in N-player General-sum Linear-quadratic Games |  | 0
Hindsight Value Function for Variance Reduction in Stochastic Dynamic Environment | Code | 0
Proximal Policy Optimization for Tracking Control Exploiting Future Reference Information |  | 0
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences |  | 0
Fine-Grained AutoAugmentation for Multi-Label Classification |  | 0
Policy Gradient Methods for Distortion Risk Measures |  | 0
Curious Explorer: a provable exploration strategy in Policy Learning |  | 0
Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment |  | 0
End-to-End Neuro-Symbolic Architecture for Image-to-Image Reasoning Tasks |  | 0
Ad Headline Generation using Self-Critical Masked Language Model |  | 0

No leaderboard results yet.