| MIP-GAF: A MLLM-annotated Benchmark for Most Important Person Localization and Group Context Understanding | Sep 10, 2024 | Benchmarking, Language Modeling | Code Available | 0 |
| Mirage: Model-Agnostic Graph Distillation for Graph Classification | Oct 14, 2023 | Benchmarking, Classification | Code Available | 0 |
| Benchmarking Subset Selection from Large Candidate Solution Sets in Evolutionary Multi-objective Optimization | Jan 18, 2022 | Benchmarking | Code Available | 0 |
| Sanity Simulations for Saliency Methods | May 13, 2021 | Benchmarking | Code Available | 0 |
| From Variability to Stability: Advancing RecSys Benchmarking Practices | Feb 15, 2024 | Benchmarking, Collaborative Filtering | Code Available | 0 |
| ALTIS: Modernizing GPGPU Benchmarking | Jun 25, 2019 | Benchmarking, GPU | Code Available | 0 |
| From raw affiliations to organization identifiers | May 12, 2025 | Benchmarking, Metadata quality | Code Available | 0 |
| Automated Text-to-Table for Reasoning-Intensive Table QA: Pipeline Design and Benchmarking Insights | May 26, 2025 | Benchmarking, Question Answering | Code Available | 0 |
| 3D Face Reconstruction Error Decomposed: A Modular Benchmark for Fair and Fast Method Evaluation | May 23, 2025 | 3D Face Reconstruction, Benchmarking | Code Available | 0 |
| MixMAS: A Framework for Sampling-Based Mixer Architecture Search for Multimodal Fusion and Learning | Dec 24, 2024 | Benchmarking | Code Available | 0 |
| From Past to Present: A Survey of Malicious URL Detection Techniques, Datasets and Code Repositories | Apr 23, 2025 | Benchmarking | Code Available | 0 |
| The Multiple Subnetwork Hypothesis: Enabling Multidomain Learning by Isolating Task-Specific Subnetworks in Feedforward Neural Networks | Jul 18, 2022 | Benchmarking | Code Available | 0 |
| MLaKE: Multilingual Knowledge Editing Benchmark for Large Language Models | Apr 7, 2024 | Benchmarking, knowledge editing | Code Available | 0 |
| SATBench: Benchmarking the speed-accuracy tradeoff in object recognition by humans and dynamic neural networks | Jun 16, 2022 | Benchmarking, Dynamic neural networks | Code Available | 0 |
| MLDebugging: Towards Benchmarking Code Debugging Across Multi-Library Scenarios | Jun 15, 2025 | Benchmarking | Code Available | 0 |
| From MNIST to ImageNet and Back: Benchmarking Continual Curriculum Learning | Mar 16, 2023 | Benchmarking, Continual Learning | Code Available | 0 |
| SAWEC: Sensing-Assisted Wireless Edge Computing | Feb 15, 2024 | Benchmarking, Edge-computing | Code Available | 0 |
| From Knowledge to Reasoning: Evaluating LLMs for Ionic Liquids Research in Chemical and Biological Engineering | May 11, 2025 | Benchmarking, General Knowledge | Code Available | 0 |
| Vote'n'Rank: Revision of Benchmarking with Social Choice Theory | Oct 11, 2022 | Benchmarking, Result aggregation | Code Available | 0 |
| AlphaZip: Neural Network-Enhanced Lossless Text Compression | Sep 23, 2024 | Benchmarking, Data Compression | Code Available | 0 |
| ML-Net: multi-label classification of biomedical texts with deep neural networks | Nov 13, 2018 | Benchmarking, Classification | Code Available | 0 |
| From Modern CNNs to Vision Transformers: Assessing the Performance, Robustness, and Classification Strategies of Deep Learning Models in Histopathology | Apr 11, 2022 | Benchmarking, Cancer Classification | Code Available | 0 |
| mlOSP: Towards a Unified Implementation of Regression Monte Carlo Algorithms | Dec 1, 2020 | Benchmarking, BIG-bench Machine Learning | Code Available | 0 |
| From Bytes to Borsch: Fine-Tuning Gemma and Mistral for the Ukrainian Language Representation | Apr 14, 2024 | Benchmarking, Diversity | Code Available | 0 |
| MLPerf Inference Benchmark | Nov 6, 2019 | Benchmarking | Code Available | 0 |