| Market making and incentives design in the presence of a dark pool: a deep reinforcement learning approach | Dec 2, 2019 | Deep Reinforcement Learning | —Unverified | 0 |
| MARNET: Backdoor Attacks against Value-Decomposition Multi-Agent Reinforcement Learning | Sep 29, 2021 | Backdoor Attack, Deep Reinforcement Learning | —Unverified | 0 |
| Mask Atari for Deep Reinforcement Learning as POMDP Benchmarks | Mar 31, 2022 | Atari Games, Deep Reinforcement Learning | —Unverified | 0 |
| Masked Generative Priors Improve World Models Sequence Modelling Capabilities | Oct 10, 2024 | continuous-control, Continuous Control | —Unverified | 0 |
| Massively Scaling Explicit Policy-conditioned Value Functions | Feb 17, 2025 | continuous-control, Continuous Control | —Unverified | 0 |
| Mastering Complex Control in MOBA Games with Deep Reinforcement Learning | Dec 20, 2019 | AI Agent, Deep Reinforcement Learning | —Unverified | 0 |
| Mastering the Game of Guandan with Deep Reinforcement Learning and Behavior Regulating | Feb 21, 2024 | Decision Making, Deep Reinforcement Learning | —Unverified | 0 |
| Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning | Jun 30, 2022 | Board Games, Decision Making | —Unverified | 0 |
| B-Coder: Value-Based Deep Reinforcement Learning for Program Synthesis | Oct 4, 2023 | Code Generation, Deep Reinforcement Learning | —Unverified | 0 |
| MAT: Multi-Fingered Adaptive Tactile Grasping via Deep Reinforcement Learning | Sep 10, 2019 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 |
| Maximizing Ensemble Diversity in Deep Reinforcement Learning | Sep 29, 2021 | Atari Games, Decision Making | —Unverified | 0 |
| Maximizing the Promptness of Metaverse Systems using Edge Computing by Deep Reinforcement Learning | Jun 3, 2025 | Deep Reinforcement Learning, Edge-computing | —Unverified | 0 |
| Reward Tweaking: Maximizing the Total Reward While Planning for Short Horizons | Feb 9, 2020 | continuous-control, Continuous Control | —Unverified | 0 |
| Maximizing User Connectivity in AI-Enabled Multi-UAV Networks: A Distributed Strategy Generalized to Arbitrary User Distributions | Nov 7, 2024 | Deep Reinforcement Learning, Q-Learning | —Unverified | 0 |
| Maximum Correntropy Value Decomposition for Multi-agent Deep Reinforcemen Learning | Aug 7, 2022 | Deep Reinforcement Learning, SMAC | —Unverified | 0 |
| MBCAL: Sample Efficient and Variance Reduced Reinforcement Learning for Recommender Systems | Nov 6, 2019 | counterfactual, Deep Reinforcement Learning | —Unverified | 0 |
| Optimal Control-Based Baseline for Guided Exploration in Policy Gradient Methods | Nov 4, 2020 | Deep Reinforcement Learning, Policy Gradient Methods | —Unverified | 0 |
| A parallel-network continuous quantitative trading model with GARCH and PPO | May 8, 2021 | Decision Making, Deep Reinforcement Learning | —Unverified | 0 |
| Mean Field Games Flock! The Reinforcement Learning Way | May 17, 2021 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 |
| Measure gradients, not activations! Enhancing neuronal activity in deep reinforcement learning | May 29, 2025 | Deep Reinforcement Learning, MuJoCo | —Unverified | 0 |
| Fast State Stabilization using Deep Reinforcement Learning for Measurement-based Quantum Feedback Control | Aug 21, 2024 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 |
| Measurement Optimization under Uncertainty using Deep Reinforcement Learning | Mar 17, 2023 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 |
| Measuring and Characterizing Generalization in Deep Reinforcement Learning | Dec 7, 2018 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 |
| Measuring Progress in Deep Reinforcement Learning Sample Efficiency | Feb 9, 2021 | Atari Games, continuous-control | —Unverified | 0 |
| Measuring Sample Efficiency and Generalization in Reinforcement Learning Benchmarks: NeurIPS 2020 Procgen Benchmark | Mar 29, 2021 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 |