| MLModelScope: A Distributed Platform for Model Evaluation and Benchmarking at Scale | Feb 19, 2020 | Benchmarking | —Unverified | 0 |
| MLPerf HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems | Oct 21, 2021 | BenchmarkingBIG-bench Machine Learning | —Unverified | 0 |
| mlr3proba: An R Package for Machine Learning in Survival Analysis | Aug 18, 2020 | BenchmarkingBIG-bench Machine Learning | —Unverified | 0 |
| ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets | Jun 12, 2024 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | —Unverified | 0 |
| MMDocBench: Benchmarking Large Vision-Language Models for Fine-Grained Visual Document Understanding | Oct 25, 2024 | Benchmarkingdocument understanding | —Unverified | 0 |
| MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents | Jan 15, 2025 | BenchmarkingOptical Character Recognition (OCR) | —Unverified | 0 |
| MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency | Feb 13, 2025 | BenchmarkingMath | —Unverified | 0 |
| MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models | Apr 4, 2025 | BenchmarkingImage Generation | —Unverified | 0 |
| MMInA: Benchmarking Multihop Multimodal Internet Agents | Apr 15, 2024 | Benchmarking | —Unverified | 0 |
| MMMG: a Comprehensive and Reliable Evaluation Suite for Multitask Multimodal Generation | May 23, 2025 | Audio GenerationBenchmarking | —Unverified | 0 |