| Market making and incentives design in the presence of a dark pool: a deep reinforcement learning approach | Dec 2, 2019 | Deep Reinforcement Learning | —Unverified | 0 |
| MARNET: Backdoor Attacks against Value-Decomposition Multi-Agent Reinforcement Learning | Sep 29, 2021 | Backdoor Attack, Deep Reinforcement Learning | —Unverified | 0 |
| Mask Atari for Deep Reinforcement Learning as POMDP Benchmarks | Mar 31, 2022 | Atari Games, Deep Reinforcement Learning | —Unverified | 0 |
| Masked Generative Priors Improve World Models Sequence Modelling Capabilities | Oct 10, 2024 | continuous-control, Continuous Control | —Unverified | 0 |
| Massively Scaling Explicit Policy-conditioned Value Functions | Feb 17, 2025 | continuous-control, Continuous Control | —Unverified | 0 |
| Mastering Complex Control in MOBA Games with Deep Reinforcement Learning | Dec 20, 2019 | AI Agent, Deep Reinforcement Learning | —Unverified | 0 |
| Mastering the Game of Guandan with Deep Reinforcement Learning and Behavior Regulating | Feb 21, 2024 | Decision Making, Deep Reinforcement Learning | —Unverified | 0 |
| Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning | Jun 30, 2022 | Board Games, Decision Making | —Unverified | 0 |
| B-Coder: Value-Based Deep Reinforcement Learning for Program Synthesis | Oct 4, 2023 | Code Generation, Deep Reinforcement Learning | —Unverified | 0 |
| MAT: Multi-Fingered Adaptive Tactile Grasping via Deep Reinforcement Learning | Sep 10, 2019 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 |
| Maximizing Ensemble Diversity in Deep Reinforcement Learning | Sep 29, 2021 | Atari Games, Decision Making | —Unverified | 0 |
| Maximizing the Promptness of Metaverse Systems using Edge Computing by Deep Reinforcement Learning | Jun 3, 2025 | Deep Reinforcement Learning, Edge-computing | —Unverified | 0 |
| Reward Tweaking: Maximizing the Total Reward While Planning for Short Horizons | Feb 9, 2020 | continuous-control, Continuous Control | —Unverified | 0 |
| Maximizing User Connectivity in AI-Enabled Multi-UAV Networks: A Distributed Strategy Generalized to Arbitrary User Distributions | Nov 7, 2024 | Deep Reinforcement Learning, Q-Learning | —Unverified | 0 |
| Maximum Correntropy Value Decomposition for Multi-agent Deep Reinforcemen Learning | Aug 7, 2022 | Deep Reinforcement Learning, SMAC | —Unverified | 0 |
| MBCAL: Sample Efficient and Variance Reduced Reinforcement Learning for Recommender Systems | Nov 6, 2019 | counterfactual, Deep Reinforcement Learning | —Unverified | 0 |
| Optimal Control-Based Baseline for Guided Exploration in Policy Gradient Methods | Nov 4, 2020 | Deep Reinforcement Learning, Policy Gradient Methods | —Unverified | 0 |
| A parallel-network continuous quantitative trading model with GARCH and PPO | May 8, 2021 | Decision Making, Deep Reinforcement Learning | —Unverified | 0 |
| Mean Field Games Flock! The Reinforcement Learning Way | May 17, 2021 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 |
| Measure gradients, not activations! Enhancing neuronal activity in deep reinforcement learning | May 29, 2025 | Deep Reinforcement Learning, MuJoCo | —Unverified | 0 |
| Fast State Stabilization using Deep Reinforcement Learning for Measurement-based Quantum Feedback Control | Aug 21, 2024 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 |
| Measurement Optimization under Uncertainty using Deep Reinforcement Learning | Mar 17, 2023 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 |
| Measuring and Characterizing Generalization in Deep Reinforcement Learning | Dec 7, 2018 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 |
| Measuring Progress in Deep Reinforcement Learning Sample Efficiency | Feb 9, 2021 | Atari Games, continuous-control | —Unverified | 0 |
| Measuring Sample Efficiency and Generalization in Reinforcement Learning Benchmarks: NeurIPS 2020 Procgen Benchmark | Mar 29, 2021 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 |
| MENTOR: Mixture-of-Experts Network with Task-Oriented Perturbation for Visual Reinforcement Learning | Oct 19, 2024 | Deep Reinforcement Learning, Mixture-of-Experts | —Unverified | 0 |
| MEPG: A Minimalist Ensemble Policy Gradient Framework for Deep Reinforcement Learning | Sep 22, 2021 | Deep Reinforcement Learning, Gaussian Processes | —Unverified | 0 |
| Merging Deterministic Policy Gradient Estimations with Varied Bias-Variance Tradeoff for Effective Deep Reinforcement Learning | Nov 24, 2019 | Deep Reinforcement Learning, Reinforcement Learning | —Unverified | 0 |
| Message-Dropout: An Efficient Training Method for Multi-Agent Deep Reinforcement Learning | Feb 18, 2019 | Deep Reinforcement Learning, Multi-agent Reinforcement Learning | —Unverified | 0 |
| Meta Arcade: A Configurable Environment Suite for Meta-Learning | Dec 1, 2021 | Deep Reinforcement Learning, Meta-Learning | —Unverified | 0 |
| Meta-Gradient Reinforcement Learning with an Objective Discovered Online | Jul 16, 2020 | Deep Reinforcement Learning, Q-Learning | —Unverified | 0 |
| Meta-Learning Transferable Active Learning Policies by Deep Reinforcement Learning | Jun 12, 2018 | Active Learning, Deep Reinforcement Learning | —Unverified | 0 |
| Meta-modeling game for deriving theoretical-consistent, micro-structural-based traction-separation laws via deep reinforcement learning | Oct 24, 2018 | Deep Reinforcement Learning, Game of Go | —Unverified | 0 |
| Meta-operators for Enabling Parallel Planning Using Deep Reinforcement Learning | Mar 13, 2024 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 |
| Metaoptimization on a Distributed System for Deep Reinforcement Learning | Feb 7, 2019 | Atari Games, Deep Reinforcement Learning | —Unverified | 0 |
| Meta Reinforcement Learning Approach for Adaptive Resource Optimization in O-RAN | Sep 30, 2024 | Decision Making, Deep Reinforcement Learning | —Unverified | 0 |
| Meta-Reinforcement Learning for Fast and Data-Efficient Spectrum Allocation in Dynamic Wireless Networks | Jul 13, 2025 | Deep Reinforcement Learning, Fairness | —Unverified | 0 |
| MetaSensing: Intelligent Metasurface Assisted RF 3D Sensing by Deep Reinforcement Learning | Nov 25, 2020 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 |
| MetaTrader: An Reinforcement Learning Approach Integrating Diverse Policies for Portfolio Optimization | Sep 1, 2022 | Decision Making, Deep Reinforcement Learning | —Unverified | 0 |
| MetaTrading: An Immersion-Aware Model Trading Framework for Vehicular Metaverse Services | Oct 25, 2024 | Deep Reinforcement Learning, Federated Learning | —Unverified | 0 |
| Methodical Advice Collection and Reuse in Deep Reinforcement Learning | Apr 14, 2022 | Atari Games, Deep Reinforcement Learning | —Unverified | 0 |
| Methods for Mitigating Uncertainty in Real-Time Operations of a Connected Microgrid | Sep 29, 2024 | Deep Reinforcement Learning, energy management | —Unverified | 0 |
| Metric-Based Imitation Learning Between Two Dissimilar Anthropomorphic Robotic Arms | Feb 25, 2020 | Deep Reinforcement Learning, Imitation Learning | —Unverified | 0 |
| Micro-Objective Learning : Accelerating Deep Reinforcement Learning through the Discovery of Continuous Subgoals | Mar 11, 2017 | Atari Games, Deep Reinforcement Learning | —Unverified | 0 |
| Microscopic Traffic Simulation by Cooperative Multi-agent Deep Reinforcement Learning | Mar 4, 2019 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 |
| MIGT: Memory Instance Gated Transformer Framework for Financial Portfolio Management | Feb 11, 2025 | Deep Reinforcement Learning, Management | —Unverified | 0 |
| MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research | Sep 27, 2021 | Deep Reinforcement Learning, NetHack | —Unverified | 0 |
| Minimalistic Attacks: How Little it Takes to Fool a Deep Reinforcement Learning Policy | Nov 10, 2019 | Adversarial Attack, Atari Games | —Unverified | 0 |
| Minimax Strikes Back | Dec 19, 2020 | Deep Reinforcement Learning, GPU | —Unverified | 0 |
| Minimizing Human Assistance: Augmenting a Single Demonstration for Deep Reinforcement Learning | Sep 22, 2022 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 |