| The BrowserGym Ecosystem for Web Agent Research | Dec 6, 2024 | Benchmarking | Code Available | 5 |
| Molecular-driven Foundation Model for Oncologic Pathology | Jan 28, 2025 | Benchmarking, Diagnostic | Code Available | 4 |
| MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from Microwatts to Megawatts for Sustainable AI | Oct 15, 2024 | Benchmarking | Code Available | 4 |
| MTEB: Massive Text Embedding Benchmark | Oct 13, 2022 | Benchmarking, Information Retrieval | Code Available | 4 |
| Bench2Drive: Towards Multi-Ability Benchmarking of Closed-Loop End-To-End Autonomous Driving | Jun 6, 2024 | Autonomous Driving, Bench2Drive | Code Available | 4 |
| Benchmarking Graphormer on Large-Scale Molecular Modeling Datasets | Mar 9, 2022 | Benchmarking, Graph Regression | Code Available | 4 |
| LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit | May 9, 2024 | Benchmarking, Computational Efficiency | Code Available | 4 |
| Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound | Feb 7, 2025 | Benchmarking | Code Available | 4 |
| AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents | May 23, 2024 | Benchmarking | Code Available | 4 |
| Enabling more efficient and cost-effective AI/ML systems with Collective Mind, virtualized MLOps, MLPerf, Collective Knowledge Playground and reproducible optimization tournaments | Jun 24, 2024 | Benchmarking | Code Available | 4 |