| MSC-Bench: Benchmarking and Analyzing Multi-Sensor Corruption for Driving Perception | Jan 2, 2025 | 3D Object Detection, Autonomous Driving | —Unverified | 0 | 0 |
| Benchmarking five global optimization approaches for nano-optical shape optimization and parameter reconstruction | Sep 18, 2018 | Bayesian Optimization, Benchmarking | —Unverified | 0 | 0 |
| MS MARCO: Benchmarking Ranking Models in the Large-Data Regime | May 9, 2021 | Benchmarking | —Unverified | 0 | 0 |
| MSQA: Benchmarking LLMs on Graduate-Level Materials Science Reasoning and Knowledge | May 29, 2025 | Benchmarking | —Unverified | 0 | 0 |
| Towards Robust and Generalizable Gerchberg Saxton based Physics Inspired Neural Networks for Computer Generated Holography: A Sensitivity Analysis Framework | Apr 30, 2025 | Benchmarking, Learning Theory | —Unverified | 0 | 0 |
| Benchmarking federated strategies in Peer-to-Peer Federated learning for biomedical data | Feb 15, 2024 | Benchmarking, Federated Learning | —Unverified | 0 | 0 |
| MTG: A Benchmarking Suite for Multilingual Text Generation | Oct 16, 2021 | Benchmarking, Question Generation | —Unverified | 0 | 0 |
| Benchmarking Federated Machine Unlearning methods for Tabular Data | Apr 1, 2025 | Benchmarking, Computational Efficiency | —Unverified | 0 | 0 |
| MTLens: Machine Translation Output Debugging | Jun 1, 2022 | Benchmarking, Machine Translation | —Unverified | 0 | 0 |
| MTOP: A Comprehensive Multilingual Task-Oriented Semantic Parsing Benchmark | Aug 21, 2020 | Benchmarking, Semantic Parsing | —Unverified | 0 | 0 |
| Towards Robust Evaluation: A Comprehensive Taxonomy of Datasets and Metrics for Open Domain Question Answering in the Era of Large Language Models | Jun 19, 2024 | Benchmarking, Open-Domain Question Answering | —Unverified | 0 | 0 |
| Benchmarking FedAvg and FedCurv for Image Classification Tasks | Mar 31, 2023 | Benchmarking, Classification | —Unverified | 0 | 0 |
| Benchmarking Critical Questions Generation: A Challenging Reasoning Task for Large Language Models | May 16, 2025 | Benchmarking | —Unverified | 0 | 0 |
| Muffin or Chihuahua? Challenging Multimodal Large Language Models with Multipanel VQA | Jan 29, 2024 | Benchmarking, Image Comprehension | —Unverified | 0 | 0 |
| Mukayese: Turkish NLP Strikes Back | Nov 16, 2021 | Benchmarking, Language Modeling | —Unverified | 0 | 0 |
| Benchmarking features from different radiomics toolkits / toolboxes using Image Biomarkers Standardization Initiative | Jun 23, 2020 | Benchmarking | —Unverified | 0 | 0 |
| Benchmarking Feature Extractors for Reinforcement Learning-Based Semiconductor Defect Localization | Nov 18, 2023 | Benchmarking, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Benchmarking Expressive Japanese Character Text-to-Speech with VITS and Style-BERT-VITS2 | May 22, 2025 | Benchmarking, Dialogue Generation | —Unverified | 0 | 0 |
| Multicalibration for Confidence Scoring in LLMs | Apr 6, 2024 | Benchmarking, Question Answering | —Unverified | 0 | 0 |
| Multi-Camera Action Dataset for Cross-Camera Action Recognition Benchmarking | Jul 21, 2016 | Action Recognition, Benchmarking | —Unverified | 0 | 0 |
| Multi-channel deep convolutional neural networks for multi-classifying thyroid disease | Mar 6, 2022 | Benchmarking, Binary Classification | —Unverified | 0 | 0 |
| Benchmarking Explanatory Models for Inertia Forecasting using Public Data of the Nordic Area | Jul 14, 2023 | Benchmarking, Time Series | —Unverified | 0 | 0 |
| Multiclass Optimal Classification Trees with SVM-splits | Nov 16, 2021 | Benchmarking, Classification | —Unverified | 0 | 0 |
| Benchmarking Evolutionary Community Detection Algorithms in Dynamic Networks | Dec 21, 2023 | Benchmarking, Community Detection | —Unverified | 0 | 0 |
| Multi-Dimensional Insights: Benchmarking Real-World Personalization in Large Multimodal Models | Dec 17, 2024 | Benchmarking | —Unverified | 0 | 0 |