| Neuroplastic Expansion in Deep Reinforcement Learning | Oct 10, 2024 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 | 0 |
| Non-local Policy Optimization via Diversity-regularized Collaborative Exploration | Jun 14, 2020 | Diversity, MuJoCo | Unverified | 0 | 0 |
| OER: Offline Experience Replay for Continual Offline Reinforcement Learning | May 23, 2023 | Continual Learning, MuJoCo | Unverified | 0 | 0 |
| Offline Imitation Learning with a Misspecified Simulator | Dec 1, 2020 | Decision Making, Friction | Unverified | 0 | 0 |
| Offline Multi-agent Reinforcement Learning via Score Decomposition | May 9, 2025 | continuous-control, Continuous Control | Unverified | 0 | 0 |
| Offline Robot Reinforcement Learning with Uncertainty-Guided Human Expert Sampling | Dec 16, 2022 | MuJoCo, Q-Learning | Unverified | 0 | 0 |
| Off-OAB: Off-Policy Policy Gradient Method with Optimal Action-Dependent Baseline | May 4, 2024 | Computational Efficiency, MuJoCo | Unverified | 0 | 0 |
| Off-Policy Deep Reinforcement Learning Algorithms for Handling Various Robotic Manipulator Tasks | Dec 11, 2022 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 | 0 |
| One is More: Diverse Perspectives within a Single Network for Efficient DRL | Oct 21, 2023 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 | 0 |
| On-Policy Model Errors in Reinforcement Learning | Oct 15, 2021 | model, MuJoCo | Unverified | 0 | 0 |
| On-Policy Policy Gradient Reinforcement Learning Without On-Policy Sampling | Nov 14, 2023 | MuJoCo, reinforcement-learning | Unverified | 0 | 0 |
| On Proximal Policy Optimization's Heavy-tailed Gradients | Feb 20, 2021 | continuous-control, Continuous Control | Unverified | 0 | 0 |
| On Representation Complexity of Model-based and Model-free Reinforcement Learning | Oct 3, 2023 | model, MuJoCo | Unverified | 0 | 0 |
| On the Convergence Theory of Meta Reinforcement Learning with Personalized Policies | Sep 21, 2022 | continuous-control, Continuous Control | Unverified | 0 | 0 |
| On the Geometry of Reinforcement Learning in Continuous State and Action Spaces | Dec 29, 2022 | MuJoCo, reinforcement-learning | Unverified | 0 | 0 |
| OPAC: Opportunistic Actor-Critic | Dec 11, 2020 | continuous-control, Continuous Control | Unverified | 0 | 0 |
| OVD-Explorer: A General Information-theoretic Exploration Approach for Reinforcement Learning | Sep 29, 2021 | MuJoCo, reinforcement-learning | Unverified | 0 | 0 |
| OVD-Explorer: Optimism Should Not Be the Sole Pursuit of Exploration in Noisy Environments | Dec 19, 2023 | continuous-control, Continuous Control | Unverified | 0 | 0 |
| Overcoming Model Bias for Robust Offline Deep Reinforcement Learning | Aug 12, 2020 | continuous-control, Continuous Control | Unverified | 0 | 0 |
| Parareal with a Learned Coarse Model for Robotic Manipulation | Dec 12, 2019 | MuJoCo | Unverified | 0 | 0 |
| Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning | Apr 22, 2020 | MuJoCo, reinforcement-learning | Unverified | 0 | 0 |
| PGPS : Coupling Policy Gradient with Population-based Search | Jan 1, 2021 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 | 0 |
| Phasic Diversity Optimization for Population-Based Reinforcement Learning | Mar 17, 2024 | Diversity, MuJoCo | Unverified | 0 | 0 |
| Policy-Driven World Model Adaptation for Robust Offline Model-based Reinforcement Learning | May 19, 2025 | D4RL, model | Unverified | 0 | 0 |
| Policy Gradient with Kernel Quadrature | Oct 23, 2023 | Causal Discovery, MuJoCo | Unverified | 0 | 0 |
| Policy Gradient With Serial Markov Chain Reasoning | Oct 13, 2022 | Decision Making, MuJoCo | Unverified | 0 | 0 |
| Policy Optimization by Genetic Distillation | Nov 3, 2017 | Deep Reinforcement Learning, Imitation Learning | Unverified | 0 | 0 |