| Proximal Policy Optimization with Continuous Bounded Action Space via the Beta Distribution | Nov 3, 2021 | continuous-control, Continuous Control | —Unverified | 0 |
| Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings | Oct 30, 2021 | Policy Gradient Methods, reinforcement-learning | —Unverified | 0 |
| Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization | Oct 19, 2021 | Policy Gradient Methods, Reinforcement Learning (RL) | —Unverified | 0 |
| Local Advantage Actor-Critic for Robust Multi-Agent Deep Reinforcement Learning | Oct 16, 2021 | Deep Reinforcement Learning, Multi-agent Reinforcement Learning | —Unverified | 0 |
| Stabilizing Dynamical Systems via Policy Gradient Methods | Oct 13, 2021 | Policy Gradient Methods | —Unverified | 0 |
| Programmatic Reinforcement Learning without Oracles | Sep 29, 2021 | Bilevel Optimization, Deep Reinforcement Learning | —Unverified | 0 |
| Variance Reduced Domain Randomization for Policy Gradient | Sep 29, 2021 | Deep Reinforcement Learning, Policy Gradient Methods | —Unverified | 0 |
| Efficient Wasserstein and Sinkhorn Policy Optimization | Sep 29, 2021 | Policy Gradient Methods, Reinforcement Learning (RL) | —Unverified | 0 |
| Sample-efficient actor-critic algorithms with an etiquette for zero-sum Markov games | Sep 29, 2021 | Policy Gradient Methods | —Unverified | 0 |
| Actor-Critic Policy Optimization in a Large-Scale Imperfect-Information Game | Sep 29, 2021 | counterfactual, Deep Reinforcement Learning | —Unverified | 0 |