| MMMR: Benchmarking Massive Multi-Modal Reasoning Tasks | May 22, 2025 | Benchmarking, Spatial Reasoning | —Unverified | 0 |
| MMSciBench: Benchmarking Language Models on Multimodal Scientific Problems | Feb 27, 2025 | Benchmarking, Visual Reasoning | —Unverified | 0 |
| MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines | Sep 19, 2024 | Benchmarking | —Unverified | 0 |
| MobileAgentBench: An Efficient and User-Friendly Benchmark for Mobile LLM Agents | Jun 12, 2024 | Benchmarking, Language Modeling | —Unverified | 0 |
| MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases | Jun 12, 2024 | Benchmarking, Model Compression | —Unverified | 0 |
| Model Agnostic Explainable Selective Regression via Uncertainty Estimation | Nov 15, 2023 | Benchmarking, model | —Unverified | 0 |
| Model-based trajectory stitching for improved behavioural cloning and its applications | Dec 8, 2022 | Behavioural cloning, Benchmarking | —Unverified | 0 |
| Model-Based Underwater 6D Pose Estimation from RGB | Feb 14, 2023 | 2D Object Detection, 6D Pose Estimation | —Unverified | 0 |
| ModelHub.AI: Dissemination Platform for Deep Learning Models | Nov 26, 2019 | Benchmarking, Deep Learning | —Unverified | 0 |
| Model Lakes | Mar 4, 2024 | Benchmarking, Management | —Unverified | 0 |
| Modelling Neuronal Behaviour with Time Series Regression: Recurrent Neural Networks on C. Elegans Data | Jul 1, 2021 | Benchmarking, regression | —Unverified | 0 |
| Modelling neuronal behaviour with time series regression: Recurrent Neural Networks on synthetic C. elegans data | Sep 29, 2021 | Benchmarking, regression | —Unverified | 0 |
| Modelling Regional Solar Photovoltaic Capacity in Great Britain | Feb 26, 2025 | Benchmarking | —Unverified | 0 |
| Model-predictive control and reinforcement learning in multi-energy system case studies | Apr 20, 2021 | Benchmarking, Model Predictive Control | —Unverified | 0 |
| Model Tampering Attacks Enable More Rigorous Evaluations of LLM Capabilities | Feb 3, 2025 | Benchmarking, Large Language Model | —Unverified | 0 |
| Modern CNNs for IoT Based Farms | Jul 15, 2019 | Benchmarking, Cloud Computing | —Unverified | 0 |
| Modern, Efficient, and Differentiable Transport Equation Models using JAX: Applications to Population Balance Equations | Nov 1, 2024 | Benchmarking, Computational Efficiency | —Unverified | 0 |
| Modified CMA-ES Algorithm for Multi-Modal Optimization: Incorporating Niching Strategies and Dynamic Adaptation Mechanism | Jul 1, 2024 | Benchmarking, Diversity | —Unverified | 0 |
| ModuLM: Enabling Modular and Multimodal Molecular Relational Learning with Large Language Models | Jun 1, 2025 | Benchmarking, Relational Reasoning | —Unverified | 0 |
| MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems | Dec 10, 2024 | Benchmarking, Mixture-of-Experts | —Unverified | 0 |
| MoE-Gyro: Self-Supervised Over-Range Reconstruction and Denoising for MEMS Gyroscopes | May 27, 2025 | Benchmarking, Denoising | —Unverified | 0 |
| MO-IOHinspector: Anytime Benchmarking of Multi-Objective Algorithms using IOHprofiler | Dec 10, 2024 | Benchmarking, Experimental Design | —Unverified | 0 |
| MolMiner: Towards Controllable, 3D-Aware, Fragment-Based Molecular Design | Nov 10, 2024 | 3D geometry, Benchmarking | —Unverified | 0 |
| MOLTR: Multiple Object Localisation, Tracking, and Reconstruction from Monocular RGB Videos | Dec 9, 2020 | Benchmarking, Object | —Unverified | 0 |
| Momentum Contrastive Pre-training for Question Answering | Dec 12, 2022 | Benchmarking, Contrastive Learning | —Unverified | 0 |
| MorisienMT: A Dataset for Mauritian Creole Machine Translation | Jun 6, 2022 | Benchmarking, Machine Translation | —Unverified | 0 |
| Morphing Attack Detection -- Database, Evaluation Platform and Benchmarking | Jun 11, 2020 | Benchmarking, Face Recognition | —Unverified | 0 |
| MORSE: Semantic-ally Drive-n MORpheme SEgment-er | Feb 7, 2017 | Benchmarking | —Unverified | 0 |
| MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models | Jan 6, 2025 | Benchmarking, Feature Compression | —Unverified | 0 |
| Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level | Nov 15, 2024 | Benchmarking, counterfactual | —Unverified | 0 |
| Movie Description | May 12, 2016 | Benchmarking | —Unverified | 0 |
| MoviePuzzle: Visual Narrative Reasoning through Multimodal Order Learning | Jun 4, 2023 | Benchmarking, Contrastive Learning | —Unverified | 0 |
| Moving Beyond Downstream Task Accuracy for Information Retrieval Benchmarking | Dec 2, 2022 | Benchmarking, Information Retrieval | —Unverified | 0 |
| MozzaVID: Mozzarella Volumetric Image Dataset | Dec 6, 2024 | Benchmarking, Computed Tomography (CT) | —Unverified | 0 |
| MPCLeague: Robust MPC Platform for Privacy-Preserving Machine Learning | Dec 26, 2021 | Benchmarking, BIG-bench Machine Learning | —Unverified | 0 |
| MRAnnotator: multi-Anatomy and many-Sequence MRI segmentation of 44 structures | Feb 1, 2024 | Anatomy, Benchmarking | —Unverified | 0 |
| MSAMSum: Towards Benchmarking Multi-lingual Dialogue Summarization | Nov 16, 2021 | Benchmarking, dialogue summary | —Unverified | 0 |
| MSC-Bench: Benchmarking and Analyzing Multi-Sensor Corruption for Driving Perception | Jan 2, 2025 | 3D Object Detection, Autonomous Driving | —Unverified | 0 |
| MS MARCO: Benchmarking Ranking Models in the Large-Data Regime | May 9, 2021 | Benchmarking | —Unverified | 0 |
| MSQA: Benchmarking LLMs on Graduate-Level Materials Science Reasoning and Knowledge | May 29, 2025 | Benchmarking | —Unverified | 0 |
| MTG: A Benchmarking Suite for Multilingual Text Generation | Oct 16, 2021 | Benchmarking, Question Generation | —Unverified | 0 |
| MTLens: Machine Translation Output Debugging | Jun 1, 2022 | Benchmarking, Machine Translation | —Unverified | 0 |
| MTOP: A Comprehensive Multilingual Task-Oriented Semantic Parsing Benchmark | Aug 21, 2020 | Benchmarking, Semantic Parsing | —Unverified | 0 |
| Muffin or Chihuahua? Challenging Multimodal Large Language Models with Multipanel VQA | Jan 29, 2024 | Benchmarking, Image Comprehension | —Unverified | 0 |
| Mukayese: Turkish NLP Strikes Back | Nov 16, 2021 | Benchmarking, Language Modeling | —Unverified | 0 |
| Multicalibration for Confidence Scoring in LLMs | Apr 6, 2024 | Benchmarking, Question Answering | —Unverified | 0 |
| Multi-Camera Action Dataset for Cross-Camera Action Recognition Benchmarking | Jul 21, 2016 | Action Recognition, Benchmarking | —Unverified | 0 |
| Multi-channel deep convolutional neural networks for multi-classifying thyroid disease | Mar 6, 2022 | Benchmarking, Binary Classification | —Unverified | 0 |
| Multiclass Optimal Classification Trees with SVM-splits | Nov 16, 2021 | Benchmarking, Classification | —Unverified | 0 |
| Multi-Dimensional Insights: Benchmarking Real-World Personalization in Large Multimodal Models | Dec 17, 2024 | Benchmarking | —Unverified | 0 |