| A Study of Policy Gradient on a Class of Exactly Solvable Models | Nov 3, 2020 | Policy Gradient Methods | —Unverified | 0 |
| Deep Reinforcement Learning based Blind mmWave MIMO Beam Alignment | Jan 25, 2020 | Deep Reinforcement LearningPolicy Gradient Methods | —Unverified | 0 |
| Asynchronous Actor-Critic for Multi-Agent Reinforcement Learning | Sep 20, 2022 | Decision MakingMulti-agent Reinforcement Learning | —Unverified | 0 |
| Deterministic Policy Gradient Primal-Dual Methods for Continuous-Space Constrained MDPs | Aug 19, 2024 | continuous-controlContinuous Control | —Unverified | 0 |
| Difference Rewards Policy Gradients | Dec 21, 2020 | counterfactualMulti-agent Reinforcement Learning | —Unverified | 0 |
| Curious Explorer: a provable exploration strategy in Policy Learning | Jun 29, 2021 | Policy Gradient Methods | —Unverified | 0 |
| A reinterpretation of the policy oscillation phenomenon in approximate policy iteration | Dec 1, 2011 | Policy Gradient MethodsReinforcement Learning | —Unverified | 0 |
| Adversarial Policy Gradient for Alternating Markov Games | Jan 1, 2018 | Policy Gradient MethodsReinforcement Learning | —Unverified | 0 |
| Countering Language Drift via Grounding | Sep 27, 2018 | Language ModelingLanguage Modelling | —Unverified | 0 |
| Correcting discount-factor mismatch in on-policy policy gradient methods | Jun 23, 2023 | OpenAI GymPolicy Gradient Methods | —Unverified | 0 |