| Accelerating Diffusion Models in Offline RL via Reward-Aware Consistency Trajectory Distillation | Jun 9, 2025 | Decision Making, MuJoCo | —Unverified | 0 | 0 |
| Accelerating Inverse Reinforcement Learning with Expert Bootstrapping | Feb 4, 2024 | Imitation Learning, MuJoCo | —Unverified | 0 | 0 |
| Accelerating Self-Imitation Learning from Demonstrations via Policy Constraints and Q-Ensemble | Dec 7, 2022 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| A Computational Model of Learning Flexible Navigation in a Maze by Layout-Conforming Replay of Place Cells | Sep 18, 2022 | MuJoCo | —Unverified | 0 | 0 |
| A Computational Theory of Learning Flexible Reward-Seeking Behavior with Place Cells | Apr 22, 2022 | MuJoCo, Open-Ended Question Answering | —Unverified | 0 | 0 |
| Action Redundancy in Reinforcement Learning | Feb 22, 2021 | MuJoCo, reinforcement-learning | —Unverified | 0 | 0 |
| Active Learning of Dynamics Using Prior Domain Knowledge in the Sampling Process | Mar 25, 2024 | Active Learning, MuJoCo | —Unverified | 0 | 0 |
| Active Reinforcement Learning Strategies for Offline Policy Improvement | Dec 17, 2024 | Active Learning, continuous-control | —Unverified | 0 | 0 |
| Actor-Director-Critic: A Novel Deep Reinforcement Learning Framework | Jan 10, 2023 | Action Classification, Decision Making | —Unverified | 0 | 0 |
| Adapting Double Q-Learning for Continuous Reinforcement Learning | Sep 25, 2023 | MuJoCo, Q-Learning | —Unverified | 0 | 0 |
| Adapting World Models with Latent-State Dynamics Residuals | Apr 3, 2025 | MuJoCo, Reinforcement Learning (RL) | —Unverified | 0 | 0 |
| Adaptive Ensemble Q-learning: Minimizing Estimation Bias via Error Feedback | Jun 20, 2023 | MuJoCo, Q-Learning | —Unverified | 0 | 0 |
| Adaptive N-step Bootstrapping with Off-policy Data | Jan 1, 2021 | Atari Games, MuJoCo | —Unverified | 0 | 0 |
| Adaptive Q-Network: On-the-fly Target Selection for Deep Reinforcement Learning | May 25, 2024 | Atari Games, AutoML | —Unverified | 0 | 0 |
| Addressing Distribution Shift in Online Reinforcement Learning with Offline Datasets | Jan 1, 2021 | D4RL, MuJoCo | —Unverified | 0 | 0 |
| ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning | May 29, 2025 | Denoising, MuJoCo | —Unverified | 0 | 0 |
| Adventurer: Exploration with BiGAN for Deep Reinforcement Learning | Mar 24, 2025 | Atari Games, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Adversarial Imitation Learning via Random Search | Aug 21, 2020 | Computational Efficiency, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| A Game-Theoretic Perspective of Generalization in Reinforcement Learning | Aug 7, 2022 | Few-Shot Learning, Meta-Learning | —Unverified | 0 | 0 |
| A Generalized Training Approach for Multiagent Learning | Sep 27, 2019 | MuJoCo | —Unverified | 0 | 0 |
| AgentMixer: Multi-Agent Correlated Policy Factorization | Jan 16, 2024 | Imitation Learning, MuJoCo | —Unverified | 0 | 0 |
| Aggressive Q-Learning with Ensembles: Achieving Both High Sample Efficiency and High Asymptotic Performance | Nov 17, 2021 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| A K-fold Method for Baseline Estimation in Policy Gradient Algorithms | Jan 3, 2017 | MuJoCo, Policy Gradient Methods | —Unverified | 0 | 0 |
| Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback | Jul 17, 2025 | EEG, MuJoCo | —Unverified | 0 | 0 |
| A Logarithmic Barrier Method For Proximal Policy Optimization | Dec 16, 2018 | MuJoCo, Reinforcement Learning | —Unverified | 0 | 0 |
| ALOHA 2: An Enhanced Low-Cost Hardware for Bimanual Teleoperation | Feb 7, 2024 | MuJoCo | —Unverified | 0 | 0 |
| A Model-Based Solution to the Offline Multi-Agent Reinforcement Learning Coordination Problem | May 26, 2023 | MuJoCo, Multi-agent Reinforcement Learning | —Unverified | 0 | 0 |
| An Empirical Analysis of Proximal Policy Optimization with Kronecker-factored Natural Gradients | Jan 17, 2018 | MuJoCo, Sensitivity | —Unverified | 0 | 0 |
| Sim2Sim Evaluation of a Novel Data-Efficient Differentiable Physics Engine for Tensegrity Robots | Nov 10, 2020 | MuJoCo | —Unverified | 0 | 0 |
| An Intelligent Social Learning-based Optimization Strategy for Black-box Robotic Control with Reinforcement Learning | Nov 11, 2023 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning | Dec 12, 2023 | MuJoCo, Offline RL | —Unverified | 0 | 0 |
| A Pontryagin Perspective on Reinforcement Learning | May 28, 2024 | MuJoCo, reinforcement-learning | —Unverified | 0 | 0 |
| A Pragmatic Look at Deep Imitation Learning | Aug 4, 2021 | Behavioural cloning, D4RL | —Unverified | 0 | 0 |
| A Recurrent Differentiable Engine for Modeling Tensegrity Robots Trainable with Low-Frequency Data | Feb 28, 2022 | MuJoCo | —Unverified | 0 | 0 |
| A Reinforcement Learning Based Controller to Minimize Forces on the Crutches of a Lower-Limb Exoskeleton | Jan 31, 2024 | Deep Reinforcement Learning, MuJoCo | —Unverified | 0 | 0 |
| A Review of Nine Physics Engines for Reinforcement Learning Research | Jul 11, 2024 | Decision Making, MuJoCo | —Unverified | 0 | 0 |
| A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization | Mar 17, 2024 | MuJoCo | —Unverified | 0 | 0 |
| A Strategy-Oriented Bayesian Soft Actor-Critic Model | Mar 7, 2023 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| Asynchronous Federated Reinforcement Learning with Policy Gradient Updates: Algorithm Design and Convergence Analysis | Apr 9, 2024 | MuJoCo, Reinforcement Learning (RL) | —Unverified | 0 | 0 |
| Attraction-Repulsion Actor-Critic for Continuous Control Reinforcement Learning | Sep 17, 2019 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| A Unified Bellman Optimality Principle Combining Reward Maximization and Empowerment | Jul 26, 2019 | MuJoCo, Reinforcement Learning | —Unverified | 0 | 0 |
| A Unifying Framework for Causal Imitation Learning with Hidden Confounders | Feb 11, 2025 | Imitation Learning, MuJoCo | —Unverified | 0 | 0 |
| AutoDIME: Automatic Design of Interesting Multi-Agent Environments | Mar 4, 2022 | Diagnostic, MuJoCo | —Unverified | 0 | 0 |
| Auto-Encoding Inverse Reinforcement Learning | Sep 29, 2021 | Decision Making, Imitation Learning | —Unverified | 0 | 0 |
| Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization | Apr 28, 2021 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| Average-Reward Reinforcement Learning with Trust Region Methods | Jun 7, 2021 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| AVG-DICE: Stationary Distribution Correction by Regression | Mar 3, 2025 | Avg, MuJoCo | —Unverified | 0 | 0 |
| Backward Imitation and Forward Reinforcement Learning via Bi-directional Model Rollouts | Aug 4, 2022 | Generative Adversarial Network, Model-based Reinforcement Learning | —Unverified | 0 | 0 |
| Balancing Constraints and Rewards with Meta-Gradient D4PG | Oct 13, 2020 | MuJoCo, Reinforcement Learning (RL) | —Unverified | 0 | 0 |
| Bayesian Distributional Policy Gradients | Mar 20, 2021 | Atari Games, Contrastive Learning | —Unverified | 0 | 0 |