| Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization | Oct 19, 2021 | Policy Gradient Methods, Reinforcement Learning (RL) | Unverified | 0 |
| Local Advantage Actor-Critic for Robust Multi-Agent Deep Reinforcement Learning | Oct 16, 2021 | Deep Reinforcement Learning, Multi-agent Reinforcement Learning | Unverified | 0 |
| Stabilizing Dynamical Systems via Policy Gradient Methods | Oct 13, 2021 | Policy Gradient Methods | Unverified | 0 |
| Transform2Act: Learning a Transform-and-Control Policy for Efficient Agent Design | Oct 7, 2021 | Decision Making, Policy Gradient Methods | Code Available | 1 |
| Actor-Critic Policy Optimization in a Large-Scale Imperfect-Information Game | Sep 29, 2021 | counterfactual, Deep Reinforcement Learning | Unverified | 0 |
| Efficient Wasserstein and Sinkhorn Policy Optimization | Sep 29, 2021 | Policy Gradient Methods, Reinforcement Learning (RL) | Unverified | 0 |
| Evolution Strategies as an Alternate Learning method for Hierarchical Reinforcement Learning | Sep 29, 2021 | Hierarchical Reinforcement Learning, Policy Gradient Methods | Unverified | 0 |
| Sample-efficient actor-critic algorithms with an etiquette for zero-sum Markov games | Sep 29, 2021 | Policy Gradient Methods | Unverified | 0 |
| Asynchronous Multi-Agent Actor-Critic with Macro-Actions | Sep 29, 2021 | Decision Making, Policy Gradient Methods | Unverified | 0 |
| Variance Reduced Domain Randomization for Policy Gradient | Sep 29, 2021 | Deep Reinforcement Learning, Policy Gradient Methods | Unverified | 0 |
| Programmatic Reinforcement Learning without Oracles | Sep 29, 2021 | Bilevel Optimization, Deep Reinforcement Learning | Unverified | 0 |
| Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods | Sep 13, 2021 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |
| Learning Opinion Summarizers by Selecting Informative Reviews | Sep 9, 2021 | Few-Shot Learning, Opinion Summarization | Code Available | 1 |
| A general class of surrogate functions for stable and efficient reinforcement learning | Aug 12, 2021 | MuJoCo, Policy Gradient Methods | Code Available | 0 |
| Value-Based Reinforcement Learning for Continuous Control Robotic Manipulation in Multi-Task Sparse Reward Settings | Jul 28, 2021 | continuous-control, Continuous Control | Unverified | 0 |
| Policy Gradient Methods Find the Nash Equilibrium in N-player General-sum Linear-quadratic Games | Jul 27, 2021 | Policy Gradient Methods | Unverified | 0 |
| Hindsight Value Function for Variance Reduction in Stochastic Dynamic Environment | Jul 26, 2021 | Deep Reinforcement Learning, Policy Gradient Methods | Code Available | 0 |
| Proximal Policy Optimization for Tracking Control Exploiting Future Reference Information | Jul 20, 2021 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |
| Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences | Jul 17, 2021 | Policy Gradient Methods | Unverified | 0 |
| Fine-Grained AutoAugmentation for Multi-Label Classification | Jul 12, 2021 | Classification, Data Augmentation | Unverified | 0 |
| Policy Gradient Methods for Distortion Risk Measures | Jul 9, 2021 | Policy Gradient Methods, reinforcement-learning | Unverified | 0 |
| Curious Explorer: a provable exploration strategy in Policy Learning | Jun 29, 2021 | Policy Gradient Methods | Unverified | 0 |
| Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment | Jun 28, 2021 | Decision Making, Policy Gradient Methods | Unverified | 0 |
| End-to-End Neuro-Symbolic Architecture for Image-to-Image Reasoning Tasks | Jun 6, 2021 | Image Reconstruction, Policy Gradient Methods | Unverified | 0 |
| Ad Headline Generation using Self-Critical Masked Language Model | Jun 1, 2021 | Headline Generation, Language Modeling | Unverified | 0 |