| On Proximal Policy Optimization's Heavy-tailed Gradients | Feb 20, 2021 | Continuous Control | —Unverified | 0 |
| On Representation Complexity of Model-based and Model-free Reinforcement Learning | Oct 3, 2023 | model, MuJoCo | —Unverified | 0 |
| On the Convergence Theory of Meta Reinforcement Learning with Personalized Policies | Sep 21, 2022 | Continuous Control | —Unverified | 0 |
| On the Geometry of Reinforcement Learning in Continuous State and Action Spaces | Dec 29, 2022 | MuJoCo, reinforcement-learning | —Unverified | 0 |
| OPAC: Opportunistic Actor-Critic | Dec 11, 2020 | Continuous Control | —Unverified | 0 |
| OVD-Explorer: A General Information-theoretic Exploration Approach for Reinforcement Learning | Sep 29, 2021 | MuJoCo, reinforcement-learning | —Unverified | 0 |
| OVD-Explorer: Optimism Should Not Be the Sole Pursuit of Exploration in Noisy Environments | Dec 19, 2023 | Continuous Control | —Unverified | 0 |
| Overcoming Model Bias for Robust Offline Deep Reinforcement Learning | Aug 12, 2020 | Continuous Control | —Unverified | 0 |
| Parareal with a Learned Coarse Model for Robotic Manipulation | Dec 12, 2019 | MuJoCo | —Unverified | 0 |
| Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning | Apr 22, 2020 | MuJoCo, reinforcement-learning | —Unverified | 0 |