| Non-local Policy Optimization via Diversity-regularized Collaborative Exploration | Jun 14, 2020 | Diversity, MuJoCo | —Unverified | 0 |
| OER: Offline Experience Replay for Continual Offline Reinforcement Learning | May 23, 2023 | Continual Learning, MuJoCo | —Unverified | 0 |
| Offline Imitation Learning with a Misspecified Simulator | Dec 1, 2020 | Decision Making, Friction | —Unverified | 0 |
| Offline Multi-agent Reinforcement Learning via Score Decomposition | May 9, 2025 | continuous-control, Continuous Control | —Unverified | 0 |
| Offline Robot Reinforcement Learning with Uncertainty-Guided Human Expert Sampling | Dec 16, 2022 | MuJoCo, Q-Learning | —Unverified | 0 |
| Off-OAB: Off-Policy Policy Gradient Method with Optimal Action-Dependent Baseline | May 4, 2024 | Computational Efficiency, MuJoCo | —Unverified | 0 |
| Off-Policy Deep Reinforcement Learning Algorithms for Handling Various Robotic Manipulator Tasks | Dec 11, 2022 | Deep Reinforcement Learning, MuJoCo | —Unverified | 0 |
| One is More: Diverse Perspectives within a Single Network for Efficient DRL | Oct 21, 2023 | Deep Reinforcement Learning, MuJoCo | —Unverified | 0 |
| On-Policy Model Errors in Reinforcement Learning | Oct 15, 2021 | model, MuJoCo | —Unverified | 0 |
| On-Policy Policy Gradient Reinforcement Learning Without On-Policy Sampling | Nov 14, 2023 | MuJoCo, reinforcement-learning | —Unverified | 0 |