| Benchmarking Graphormer on Large-Scale Molecular Modeling Datasets | Mar 9, 2022 | BenchmarkingGraph Regression | CodeCode Available | 4 | 5 |
| OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning | Dec 31, 2024 | BenchmarkingLogical Reasoning | CodeCode Available | 4 | 5 |
| Molecular-driven Foundation Model for Oncologic Pathology | Jan 28, 2025 | BenchmarkingDiagnostic | CodeCode Available | 4 | 5 |
| Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound | Feb 7, 2025 | Benchmarking | CodeCode Available | 4 | 5 |
| MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from Microwatts to Megawatts for Sustainable AI | Oct 15, 2024 | Benchmarking | CodeCode Available | 4 | 5 |
| MTEB: Massive Text Embedding Benchmark | Oct 13, 2022 | BenchmarkingInformation Retrieval | CodeCode Available | 4 | 5 |
| Enabling more efficient and cost-effective AI/ML systems with Collective Mind, virtualized MLOps, MLPerf, Collective Knowledge Playground and reproducible optimization tournaments | Jun 24, 2024 | Benchmarking | CodeCode Available | 4 | 5 |
| Aequitas Flow: Streamlining Fair ML Experimentation | May 9, 2024 | BenchmarkingFairness | CodeCode Available | 4 | 5 |
| Dora: Sampling and Benchmarking for 3D Shape Variational Auto-Encoders | Dec 23, 2024 | 3D Shape ModelingBenchmarking | CodeCode Available | 4 | 5 |
| I Think, Therefore I am: Benchmarking Awareness of Large Language Models Using AwareBench | Jan 31, 2024 | BenchmarkingMultiple-choice | CodeCode Available | 4 | 5 |