| What About Taking Policy as Input of Value Function: Policy-extended Value Function Approximator | Sep 28, 2020 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| Provably Robust Blackbox Optimization for Reinforcement Learning | Mar 7, 2019 | MuJoCo, reinforcement-learning | —Unverified | 0 | 0 |
| Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning | Sep 8, 2021 | Adversarial Attack, continuous-control | —Unverified | 0 | 0 |
| Yes, Q-learning Helps Offline In-Context RL | Feb 24, 2025 | In-Context Reinforcement Learning, MuJoCo | —Unverified | 0 | 0 |
| Stealthy and Efficient Adversarial Attacks against Deep Reinforcement Learning | May 14, 2020 | Adversarial Attack, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Inverse Reinforcement Learning with the Average Reward Criterion | May 24, 2023 | MuJoCo, reinforcement-learning | —Unverified | 0 | 0 |
| SelfBC: Self Behavior Cloning for Offline Reinforcement Learning | Aug 4, 2024 | Attribute, D4RL | —Unverified | 0 | 0 |
| SrSv: Integrating Sequential Rollouts with Sequential Value Estimation for Multi-agent Reinforcement Learning | Mar 3, 2025 | MuJoCo, Multi-agent Reinforcement Learning | —Unverified | 0 | 0 |
| Modular Recurrence in Contextual MDPs for Universal Morphology Control | Jun 10, 2025 | Deep Reinforcement Learning, MuJoCo | —Unverified | 0 | 0 |
| Wasserstein Barycenter Soft Actor-Critic | Jun 11, 2025 | continuous-control, Continuous Control | —Unverified | 0 | 0 |