| Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation | Feb 18, 2024 | Benchmarking, Language Modeling | Code Available | 1 | 5 |
| HINT3: Raising the bar for Intent Detection in the Wild | Sep 29, 2020 | Benchmarking, Intent Detection | Code Available | 1 | 5 |
| Hopfield-Enhanced Deep Neural Networks for Artifact-Resilient Brain State Decoding | Nov 6, 2023 | Benchmarking, Data Compression | Code Available | 1 | 5 |
| HazeSpace2M: A Dataset for Haze Aware Single Image Dehazing | Sep 25, 2024 | Benchmarking, Image Dehazing | Code Available | 1 | 5 |
| 4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on Relational DBs | Apr 28, 2024 | Benchmarking | Code Available | 1 | 5 |
| CODEMENV: Benchmarking Large Language Models on Code Migration | Jun 1, 2025 | Benchmarking | Code Available | 1 | 5 |
| CodeReef: an open platform for portable MLOps, reusable automation actions and reproducible benchmarking | Jan 22, 2020 | Benchmarking, object-detection | Code Available | 1 | 5 |
| Machine Learning for the Digital Typhoon Dataset: Extensions to Multiple Basins and New Developments in Representations and Tasks | Nov 25, 2024 | Benchmarking, object-detection | Code Available | 1 | 5 |
| AIPerf: Automated machine learning as an AI-HPC benchmark | Aug 17, 2020 | AutoML, Benchmarking | Code Available | 1 | 5 |
| Benchmarking Spectral Graph Neural Networks: A Comprehensive Study on Effectiveness and Efficiency | Jun 14, 2024 | Benchmarking | Code Available | 1 | 5 |