| Wasserstein Actor-Critic: Directed Exploration via Optimism for Continuous-Actions Control | Mar 4, 2023 | MuJoCo, Q-Learning | —Unverified | 0 |
| Wasserstein Unsupervised Reinforcement Learning | Oct 15, 2021 | Hierarchical Reinforcement Learning, MuJoCo | —Unverified | 0 |
| Weighted Entropy Modification for Soft Actor-Critic | Nov 18, 2020 | MuJoCo, reinforcement-learning | —Unverified | 0 |
| What About Taking Policy as Input of Value Function: Policy-extended Value Function Approximator | Sep 28, 2020 | continuous-control, Continuous Control | —Unverified | 0 |
| Provably Robust Blackbox Optimization for Reinforcement Learning | Mar 7, 2019 | MuJoCo, reinforcement-learning | —Unverified | 0 |
| Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning | Sep 8, 2021 | Adversarial Attack, continuous-control | —Unverified | 0 |
| Yes, Q-learning Helps Offline In-Context RL | Feb 24, 2025 | In-Context Reinforcement Learning, MuJoCo | —Unverified | 0 |
| LLM-Explorer: A Plug-in Reinforcement Learning Policy Exploration Enhancement Driven by Large Language Models | May 21, 2025 | MuJoCo, Reinforcement Learning (RL) | —Unverified | 0 |
| Low-Rank Agent-Specific Adaptation (LoRASA) for Multi-Agent Policy Learning | Feb 8, 2025 | MuJoCo, Multi-agent Reinforcement Learning | —Unverified | 0 |
| Lyceum: An efficient and scalable ecosystem for robot learning | Jan 21, 2020 | Model Predictive Control, MuJoCo | —Unverified | 0 |
| MANGA: Method Agnostic Neural-policy Generalization and Adaptation | Nov 19, 2019 | Imitation Learning, MuJoCo | —Unverified | 0 |
| Markov Balance Satisfaction Improves Performance in Strictly Batch Offline Imitation Learning | Aug 17, 2024 | Density Estimation, Imitation Learning | —Unverified | 0 |
| Markov flow policy -- deep MC | May 1, 2024 | MuJoCo | —Unverified | 0 |
| Masked Imitation Learning: Discovering Environment-Invariant Modalities in Multimodal Demonstrations | Sep 16, 2022 | Decision Making, Imitation Learning | —Unverified | 0 |
| Maximizing Ensemble Diversity in Deep Reinforcement Learning | Sep 29, 2021 | Atari Games, Decision Making | —Unverified | 0 |
| Maximum Entropy On-Policy Actor-Critic via Entropy Advantage Estimation | Jul 25, 2024 | MuJoCo | —Unverified | 0 |
| Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees | Oct 4, 2022 | counterfactual, Imitation Learning | —Unverified | 0 |
| Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning | Jun 15, 2022 | Autonomous Driving, continuous-control | —Unverified | 0 |
| Measure gradients, not activations! Enhancing neuronal activity in deep reinforcement learning | May 29, 2025 | Deep Reinforcement Learning, MuJoCo | —Unverified | 0 |
| Memory Sequence Length of Data Sampling Impacts the Adaptation of Meta-Reinforcement Learning Agents | Jun 18, 2024 | continuous-control, Continuous Control | —Unverified | 0 |
| MESA: Cooperative Meta-Exploration in Multi-Agent Learning through Exploiting State-Action Space Structure | May 1, 2024 | Efficient Exploration, MuJoCo | —Unverified | 0 |
| MetaDiffuser: Diffusion Model as Conditional Planner for Offline Meta-RL | May 31, 2023 | MuJoCo, Reinforcement Learning (RL) | —Unverified | 0 |
| Meta-Reinforcement Learning Based on Self-Supervised Task Representation Learning | Apr 29, 2023 | Meta Reinforcement Learning, MuJoCo | —Unverified | 0 |
| Meta-Reinforcement Learning via Exploratory Task Clustering | Feb 15, 2023 | Clustering, Meta Reinforcement Learning | —Unverified | 0 |
| Meta Reinforcement Learning with Distribution of Exploration Parameters Learned by Evolution Strategies | Dec 29, 2018 | Meta-Learning, Meta Reinforcement Learning | —Unverified | 0 |