| Maximizing Ensemble Diversity in Deep Reinforcement Learning | Sep 29, 2021 | Atari Games, Decision Making | —Unverified | 0 | 0 |
| Maximum Entropy On-Policy Actor-Critic via Entropy Advantage Estimation | Jul 25, 2024 | MuJoCo | —Unverified | 0 | 0 |
| Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees | Oct 4, 2022 | counterfactual, Imitation Learning | —Unverified | 0 | 0 |
| Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning | Jun 15, 2022 | Autonomous Driving, continuous-control | —Unverified | 0 | 0 |
| Measure gradients, not activations! Enhancing neuronal activity in deep reinforcement learning | May 29, 2025 | Deep Reinforcement Learning, MuJoCo | —Unverified | 0 | 0 |
| Memory Sequence Length of Data Sampling Impacts the Adaptation of Meta-Reinforcement Learning Agents | Jun 18, 2024 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| MESA: Cooperative Meta-Exploration in Multi-Agent Learning through Exploiting State-Action Space Structure | May 1, 2024 | Efficient Exploration, MuJoCo | —Unverified | 0 | 0 |
| MetaDiffuser: Diffusion Model as Conditional Planner for Offline Meta-RL | May 31, 2023 | MuJoCo, Reinforcement Learning (RL) | —Unverified | 0 | 0 |
| Meta-Reinforcement Learning Based on Self-Supervised Task Representation Learning | Apr 29, 2023 | Meta Reinforcement Learning, MuJoCo | —Unverified | 0 | 0 |
| Meta-Reinforcement Learning via Exploratory Task Clustering | Feb 15, 2023 | Clustering, Meta Reinforcement Learning | —Unverified | 0 | 0 |
| Meta Reinforcement Learning with Distribution of Exploration Parameters Learned by Evolution Strategies | Dec 29, 2018 | Meta-Learning, Meta Reinforcement Learning | —Unverified | 0 | 0 |
| Mind's Eye: Grounded Language Model Reasoning through Simulation | Oct 11, 2022 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Model-based Adversarial Imitation Learning | Dec 7, 2016 | Imitation Learning, model | —Unverified | 0 | 0 |
| Model-Based Reward Shaping for Adversarial Inverse Reinforcement Learning in Stochastic Environments | Oct 4, 2024 | MuJoCo | —Unverified | 0 | 0 |
| Model-Invariant State Abstractions for Model-Based Reinforcement Learning | Feb 19, 2021 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| MQES: Max-Q Entropy Search for Efficient Exploration in Continuous Reinforcement Learning | Jan 1, 2021 | Efficient Exploration, MuJoCo | —Unverified | 0 | 0 |
| Multi-Object Grasping in the Plane | Jun 1, 2022 | MuJoCo, Object | —Unverified | 0 | 0 |
| Multi-Objective Algorithms for Learning Open-Ended Robotic Problems | Nov 11, 2024 | Diversity, Evolutionary Algorithms | —Unverified | 0 | 0 |
| Multi-Path Policy Optimization | Nov 11, 2019 | Deep Reinforcement Learning, Efficient Exploration | —Unverified | 0 | 0 |
| Multi-step Greedy Reinforcement Learning Algorithms | Oct 7, 2019 | Continuous Control, Game of Go | —Unverified | 0 | 0 |
| Multi-task Reinforcement Learning with a Planning Quasi-Metric | Feb 8, 2020 | MuJoCo, reinforcement-learning | —Unverified | 0 | 0 |
| Mutual-Information Regularization in Markov Decision Processes and Actor-Critic Learning | Sep 11, 2019 | MuJoCo, Q-Learning | —Unverified | 0 | 0 |
| NADPEx: An on-policy temporally consistent exploration method for deep reinforcement learning | Dec 21, 2018 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| Neural Episodic Control with State Abstraction | Jan 27, 2023 | Deep Reinforcement Learning, MuJoCo | —Unverified | 0 | 0 |
| Neural Population Learning beyond Symmetric Zero-sum Games | Jan 10, 2024 | MuJoCo, Transfer Learning | —Unverified | 0 | 0 |