| Learning Task Belief Similarity with Latent Dynamics for Meta-Reinforcement Learning | Jun 24, 2025 | Meta Reinforcement Learning, MuJoCo | Code Available | 0 | 5 |
| A dynamical clipping approach with task feedback for Proximal Policy Optimization | Dec 12, 2023 | Language Modelling, Large Language Model | Code Available | 0 | 5 |
| Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy | Jul 25, 2022 | Continuous Control | Code Available | 0 | 5 |
| Learning Calibratable Policies using Programmatic Style-Consistency | Oct 2, 2019 | Imitation Learning, MuJoCo | Code Available | 0 | 5 |
| Learning Generalizable Skills from Offline Multi-Task Data for Multi-Agent Cooperation | Mar 27, 2025 | MuJoCo, SMAC | Code Available | 0 | 5 |
| Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning | Nov 22, 2018 | Hierarchical Reinforcement Learning, MuJoCo | Code Available | 0 | 5 |
| Proximal Policy Distillation | Jul 21, 2024 | Continuous Control | Code Available | 0 | 5 |
| Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic | Nov 7, 2016 | Continuous Control | Code Available | 0 | 5 |
| Controlled Diversity with Preference : Towards Learning a Diverse Set of Desired Skills | Mar 7, 2023 | Diversity, MuJoCo | Code Available | 0 | 5 |
| Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp | Nov 30, 2020 | Continuous Control | Code Available | 0 | 5 |