| Merging Decision Transformers: Weight Averaging for Forming Multi-Task Policies | Mar 14, 2023 | Decision Making, MuJoCo | Code Available | 0 |
| Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning | Oct 15, 2024 | D4RL, Model-based Reinforcement Learning | Code Available | 0 |
| Variance Penalized On-Policy and Off-Policy Actor-Critic | Feb 3, 2021 | MuJoCo | Code Available | 0 |
| Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning | Nov 22, 2018 | Hierarchical Reinforcement Learning, MuJoCo | Code Available | 0 |
| RIZE: Regularized Imitation Learning via Distributional Reinforcement Learning | Feb 27, 2025 | Distributional Reinforcement Learning, Imitation Learning | Code Available | 0 |
| Structured Control Nets for Deep Reinforcement Learning | Feb 22, 2018 | Decision Making, Deep Reinforcement Learning | Code Available | 0 |
| Cyclic Policy Distillation: Sample-Efficient Sim-to-Real Reinforcement Learning with Domain Randomization | Jul 29, 2022 | Deep Reinforcement Learning, MuJoCo | Code Available | 0 |
| Learning Generalizable Skills from Offline Multi-Task Data for Multi-Agent Cooperation | Mar 27, 2025 | MuJoCo, SMAC | Code Available | 0 |
| Mildly Constrained Evaluation Policy for Offline Reinforcement Learning | Jun 6, 2023 | D4RL, MuJoCo | Code Available | 0 |
| SUPERVISED POLICY UPDATE | May 1, 2019 | Deep Reinforcement Learning, MuJoCo | Code Available | 0 |
| Learning Calibratable Policies using Programmatic Style-Consistency | Oct 2, 2019 | Imitation Learning, MuJoCo | Code Available | 0 |
| Language as an Abstraction for Hierarchical Deep Reinforcement Learning | Jun 18, 2019 | Deep Reinforcement Learning, Instruction Following | Code Available | 0 |
| Supervised Policy Update for Deep Reinforcement Learning | May 29, 2018 | Deep Reinforcement Learning, MuJoCo | Code Available | 0 |
| Controlled Diversity with Preference : Towards Learning a Diverse Set of Desired Skills | Mar 7, 2023 | Diversity, MuJoCo | Code Available | 0 |
| Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning? | May 20, 2024 | Atari Games, Mamba | Code Available | 0 |
| Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp | Nov 30, 2020 | continuous-control, Continuous Control | Code Available | 0 |
| Imitation Learning from Purified Demonstrations | Oct 11, 2023 | Decision Making, Imitation Learning | Code Available | 0 |
| Imitation Learning from Observations under Transition Model Disparity | Apr 25, 2022 | Imitation Learning, model | Code Available | 0 |
| Robust Deep Reinforcement Learning with Adaptive Adversarial Perturbations in Action Space | May 20, 2024 | Deep Reinforcement Learning, MuJoCo | Code Available | 0 |
| MuJoCo: A physics engine for model-based control | Oct 7, 2012 | model, MuJoCo | Code Available | 0 |
| Weak Human Preference Supervision For Deep Reinforcement Learning | Jul 25, 2020 | Deep Reinforcement Learning, MuJoCo | Code Available | 0 |
| Human-guided Robot Behavior Learning: A GAN-assisted Preference-based Reinforcement Learning Approach | Oct 15, 2020 | Generative Adversarial Network, MuJoCo | Code Available | 0 |
| Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards | Oct 10, 2019 | Hierarchical Reinforcement Learning, MuJoCo | Code Available | 0 |
| Heterogeneous Multi-Agent Reinforcement Learning via Mirror Descent Policy Optimization | Aug 13, 2023 | LEMMA, MuJoCo | Code Available | 0 |
| Adaptive Exploration for Data-Efficient General Value Function Evaluations | May 13, 2024 | MuJoCo | Code Available | 0 |
| A Quadratic Actor Network for Model-Free Reinforcement Learning | Mar 11, 2021 | continuous-control, Continuous Control | Code Available | 0 |
| Trust the Model Where It Trusts Itself -- Model-Based Actor-Critic with Uncertainty-Aware Rollout Adaption | May 29, 2024 | model, Model-based Reinforcement Learning | Code Available | 0 |
| Robust Model-Based Reinforcement Learning with an Adversarial Auxiliary Model | Jun 14, 2024 | Board Games, model | Code Available | 0 |
| Hard-Thresholding Meets Evolution Strategies in Reinforcement Learning | May 2, 2024 | Decision Making, MuJoCo | Code Available | 0 |
| Symmetric Reinforcement Learning Loss for Robust Learning on Diverse Tasks and Model Scales | May 27, 2024 | Atari Games, MuJoCo | Code Available | 0 |
| Robust Policy Gradient against Strong Data Corruption | Feb 11, 2021 | continuous-control, Continuous Control | Code Available | 0 |
| TaSIL: Taylor Series Imitation Learning | May 30, 2022 | continuous-control, Continuous Control | Code Available | 0 |
| Handling Delay in Real-Time Reinforcement Learning | Mar 30, 2025 | MuJoCo, reinforcement-learning | Code Available | 0 |
| GO-DICE: Goal-Conditioned Option-Aware Offline Imitation Learning via Stationary Distribution Correction Estimation | Dec 17, 2023 | Imitation Learning, MuJoCo | Code Available | 0 |
| NerveNet: Learning Structured Policy with Graph Neural Networks | Jan 1, 2018 | Benchmarking, continuous-control | Code Available | 0 |
| Task-Aware Virtual Training: Enhancing Generalization in Meta-Reinforcement Learning for Out-of-Distribution Tasks | Feb 5, 2025 | Meta Reinforcement Learning, MuJoCo | Code Available | 0 |
| Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning | Aug 8, 2017 | Deep Reinforcement Learning, model | Code Available | 0 |
| Robust Reinforcement Learning via Adversarial training with Langevin Dynamics | Feb 14, 2020 | MuJoCo, reinforcement-learning | Code Available | 0 |
| ROER: Regularized Optimal Experience Replay | Jul 4, 2024 | continuous-control, Continuous Control | Code Available | 0 |
| No Need for Interactions: Robust Model-Based Imitation Learning using Neural ODE | Apr 3, 2021 | Imitation Learning, MuJoCo | Code Available | 0 |
| TD3 with Reverse KL Regularizer for Offline Reinforcement Learning from Mixed Datasets | Dec 5, 2022 | D4RL, MuJoCo | Code Available | 0 |
| Generalized Adaptive Transfer Network: Enhancing Transfer Learning in Reinforcement Learning Across Domains | Jul 2, 2025 | Atari Games, Chatbot | Code Available | 0 |
| Novel Policy Seeking with Constrained Optimization | May 21, 2020 | Diversity, MuJoCo | Code Available | 0 |
| Continuous Control With Ensemble Deep Deterministic Policy Gradients | Nov 30, 2021 | continuous-control, Continuous Control | Code Available | 0 |
| Context-Based Soft Actor Critic for Environments with Non-stationary Dynamics | May 7, 2021 | continuous-control, Continuous Control | Code Available | 0 |
| Formal Language Constraints for Markov Decision Processes | Oct 2, 2019 | Atari Games, MuJoCo | Code Available | 0 |
| Feudal Graph Reinforcement Learning | Apr 11, 2023 | Decision Making, Graph Clustering | Code Available | 0 |
| TEAC: Intergrating Trust Region and Max Entropy Actor Critic for Continuous Control | Jan 1, 2021 | continuous-control, Continuous Control | Code Available | 0 |
| Offline Reinforcement Learning via Inverse Optimization | Feb 27, 2025 | Model Predictive Control, MuJoCo | Code Available | 0 |
| Variational Delayed Policy Optimization | May 23, 2024 | MuJoCo, Reinforcement Learning (RL) | Code Available | 0 |