| Coagent Networks: Generalized and Scaled | May 16, 2023 | MuJoCo, Reinforcement Learning (RL) | —Unverified | 0 |
| Learning Constraint Network from Demonstrations via Positive-Unlabeled Learning with Memory Replay | Jul 23, 2024 | MuJoCo | —Unverified | 0 |
| Learning Loss Landscapes in Preference Optimization | Nov 10, 2024 | MuJoCo | —Unverified | 0 |
| Generalized Maximum Entropy Reinforcement Learning via Reward Shaping | Sep 29, 2021 | MuJoCo, reinforcement-learning | —Unverified | 0 |
| Deep Reinforcement Learning for Dexterous Manipulation with Concept Networks | Sep 20, 2017 | Deep Reinforcement Learning, MuJoCo | —Unverified | 0 |
| Balancing Constraints and Rewards with Meta-Gradient D4PG | Oct 13, 2020 | MuJoCo, Reinforcement Learning (RL) | —Unverified | 0 |
| Deep exploration by novelty-pursuit with maximum state entropy | Sep 25, 2019 | Efficient Exploration, MuJoCo | —Unverified | 0 |
| Aggressive Q-Learning with Ensembles: Achieving Both High Sample Efficiency and High Asymptotic Performance | Nov 17, 2021 | continuous-control, Continuous Control | —Unverified | 0 |
| Decorrelated Double Q-learning | Jun 12, 2020 | continuous-control, Continuous Control | —Unverified | 0 |
| Adapting Double Q-Learning for Continuous Reinforcement Learning | Sep 25, 2023 | MuJoCo, Q-Learning | —Unverified | 0 |
| AgentMixer: Multi-Agent Correlated Policy Factorization | Jan 16, 2024 | Imitation Learning, MuJoCo | —Unverified | 0 |
| Learning Complicated Manipulation Skills via Deterministic Policy with Limited Demonstrations | Mar 29, 2023 | Deep Reinforcement Learning, MuJoCo | —Unverified | 0 |
| Learning Efficient and Effective Exploration Policies with Counterfactual Meta Policy | May 28, 2019 | counterfactual, Efficient Exploration | —Unverified | 0 |
| Learning from Good Trajectories in Offline Multi-Agent Reinforcement Learning | Nov 28, 2022 | continuous-control, Continuous Control | —Unverified | 0 |
| Dealing with Sparse Rewards in Continuous Control Robotics via Heavy-Tailed Policies | Jun 12, 2022 | continuous-control, Continuous Control | —Unverified | 0 |
| DDPG++: Striving for Simplicity in Continuous-control Off-Policy Reinforcement Learning | Jun 26, 2020 | continuous-control, Continuous Control | —Unverified | 0 |
| Backward Imitation and Forward Reinforcement Learning via Bi-directional Model Rollouts | Aug 4, 2022 | Generative Adversarial Network, Model-based Reinforcement Learning | —Unverified | 0 |
| Data Valuation for Offline Reinforcement Learning | May 19, 2022 | Data Valuation, Deep Reinforcement Learning | —Unverified | 0 |
| A Game-Theoretic Perspective of Generalization in Reinforcement Learning | Aug 7, 2022 | Few-Shot Learning, Meta-Learning | —Unverified | 0 |
| Accelerating Diffusion Models in Offline RL via Reward-Aware Consistency Trajectory Distillation | Jun 9, 2025 | Decision Making, MuJoCo | —Unverified | 0 |
| AVG-DICE: Stationary Distribution Correction by Regression | Mar 3, 2025 | Avg, MuJoCo | —Unverified | 0 |
| CrossNorm: On Normalization for Off-Policy Reinforcement Learning | Sep 25, 2019 | MuJoCo, reinforcement-learning | —Unverified | 0 |
| Average-Reward Reinforcement Learning with Trust Region Methods | Jun 7, 2021 | continuous-control, Continuous Control | —Unverified | 0 |
| SrSv: Integrating Sequential Rollouts with Sequential Value Estimation for Multi-agent Reinforcement Learning | Mar 3, 2025 | MuJoCo, Multi-agent Reinforcement Learning | —Unverified | 0 |
| Learning from Observations Using a Single Video Demonstration and Human Feedback | Sep 29, 2019 | MuJoCo | —Unverified | 0 |
| Learning rigid-body simulators over implicit shapes for large-scale scenes and vision | May 22, 2024 | MuJoCo | —Unverified | 0 |
| Cross-Domain Imitation Learning with a Dual Structure | Jun 2, 2020 | Imitation Learning, MuJoCo | —Unverified | 0 |
| Cooperative Multi-Agent Deep Reinforcement Learning in Content Ranking Optimization | Aug 8, 2024 | Deep Reinforcement Learning, Information Retrieval | —Unverified | 0 |
| Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization | Apr 28, 2021 | continuous-control, Continuous Control | —Unverified | 0 |
| Cooperative Heterogeneous Deep Reinforcement Learning | Nov 2, 2020 | continuous-control, Continuous Control | —Unverified | 0 |
| Auto-Encoding Inverse Reinforcement Learning | Sep 29, 2021 | Decision Making, Imitation Learning | —Unverified | 0 |
| Control Transformer: Robot Navigation in Unknown Environments through PRM-Guided Return-Conditioned Sequence Modeling | Nov 11, 2022 | MuJoCo, Navigate | —Unverified | 0 |
| AutoDIME: Automatic Design of Interesting Multi-Agent Environments | Mar 4, 2022 | Diagnostic, MuJoCo | —Unverified | 0 |
| Active Reinforcement Learning Strategies for Offline Policy Improvement | Dec 17, 2024 | Active Learning, continuous-control | —Unverified | 0 |
| A Unifying Framework for Causal Imitation Learning with Hidden Confounders | Feb 11, 2025 | Imitation Learning, MuJoCo | —Unverified | 0 |
| Actor-Director-Critic: A Novel Deep Reinforcement Learning Framework | Jan 10, 2023 | Action Classification, Decision Making | —Unverified | 0 |
| Language to Rewards for Robotic Skill Synthesis | Jun 14, 2023 | In-Context Learning, Logical Reasoning | —Unverified | 0 |
| Continuous Neural Algorithmic Planners | Nov 29, 2022 | continuous-control, Continuous Control | —Unverified | 0 |
| Continuous Mean-Zero Disagreement-Regularized Imitation Learning (CMZ-DRIL) | Mar 2, 2024 | Imitation Learning, MuJoCo | —Unverified | 0 |
| A Unified Bellman Optimality Principle Combining Reward Maximization and Empowerment | Jul 26, 2019 | MuJoCo, Reinforcement Learning | —Unverified | 0 |
| Improving Actor-Critic Reinforcement Learning via Hamiltonian Monte Carlo Method | Mar 22, 2021 | continuous-control, Continuous Control | —Unverified | 0 |
| Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization | Apr 4, 2022 | continuous-control, Continuous Control | —Unverified | 0 |
| Improved Soft Actor-Critic: Mixing Prioritized Off-Policy Samples with On-Policy Experience | Sep 24, 2021 | continuous-control, Continuous Control | —Unverified | 0 |
| Improved Communication Efficiency in Federated Natural Policy Gradient via ADMM-based Gradient Updates | Oct 9, 2023 | MuJoCo | —Unverified | 0 |
| Attraction-Repulsion Actor-Critic for Continuous Control Reinforcement Learning | Sep 17, 2019 | continuous-control, Continuous Control | —Unverified | 0 |
| Adversarial Imitation Learning via Random Search | Aug 21, 2020 | Computational Efficiency, Deep Reinforcement Learning | —Unverified | 0 |
| Imitation Learning from Video by Leveraging Proprioception | May 22, 2019 | Imitation Learning, MuJoCo | —Unverified | 0 |
| Improving Context-Based Meta-Reinforcement Learning with Self-Supervised Trajectory Contrastive Learning | Mar 10, 2021 | Contrastive Learning, Meta Reinforcement Learning | —Unverified | 0 |
| Continuous Control for Searching and Planning with a Learned Model | Jun 12, 2020 | continuous-control, Continuous Control | —Unverified | 0 |
| Contextual Transformer for Offline Meta Reinforcement Learning | Nov 15, 2022 | D4RL, Meta Reinforcement Learning | —Unverified | 0 |