| Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback | Jul 17, 2025 | EEGMuJoCo | —Unverified | 0 |
| Turning Sand to Gold: Recycling Data to Bridge On-Policy and Off-Policy Learning via Causal Bound | Jul 15, 2025 | counterfactualDecision Making | —Unverified | 0 |
| Deep Reinforcement Learning with Gradient Eligibility Traces | Jul 12, 2025 | Deep Reinforcement LearningMuJoCo | CodeCode Available | 1 |
| Safe Domain Randomization via Uncertainty-Aware Out-of-Distribution Detection and Policy Adaptation | Jul 8, 2025 | MuJoCoOut-of-Distribution Detection | —Unverified | 0 |
| Detecting and Mitigating Reward Hacking in Reinforcement Learning Systems: A Comprehensive Empirical Study | Jul 8, 2025 | MuJoCoRecommendation Systems | —Unverified | 0 |
| Generalized Adaptive Transfer Network: Enhancing Transfer Learning in Reinforcement Learning Across Domains | Jul 2, 2025 | Atari GamesChatbot | CodeCode Available | 0 |
| rQdia: Regularizing Q-Value Distributions With Image Augmentation | Jun 26, 2025 | continuous-controlContinuous Control | —Unverified | 0 |
| Beyond-Expert Performance with Limited Demonstrations: Efficient Imitation Learning with Double Exploration | Jun 25, 2025 | Imitation LearningMuJoCo | —Unverified | 0 |
| Learning Task Belief Similarity with Latent Dynamics for Meta-Reinforcement Learning | Jun 24, 2025 | Meta Reinforcement LearningMuJoCo | CodeCode Available | 0 |
| ADDQ: Adaptive Distributional Double Q-Learning | Jun 24, 2025 | Distributional Reinforcement LearningMuJoCo | CodeCode Available | 0 |