| How to Benchmark Vision Foundation Models for Semantic Segmentation? | Apr 18, 2024 | BenchmarkingDecoder | CodeCode Available | 1 | 5 |
| A framework for benchmarking clustering algorithms | Sep 20, 2022 | BenchmarkingClustering | CodeCode Available | 1 | 5 |
| "How Robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken Conversations | Sep 28, 2021 | BenchmarkingDialogue State Tracking | CodeCode Available | 1 | 5 |
| MELTing point: Mobile Evaluation of Language Transformers | Mar 19, 2024 | BenchmarkingQuantization | CodeCode Available | 1 | 5 |
| How to Train Neural Field Representations: A Comprehensive Study and Benchmark | Dec 16, 2023 | Benchmarking | CodeCode Available | 1 | 5 |
| MetaBox: A Benchmark Platform for Meta-Black-Box Optimization with Reinforcement Learning | Oct 12, 2023 | Benchmarking | CodeCode Available | 1 | 5 |
| Benchmarking Recommendation, Classification, and Tracing Based on Hugging Face Knowledge Graph | May 23, 2025 | BenchmarkingManagement | CodeCode Available | 1 | 5 |
| MetaFormer and CNN Hybrid Model for Polyp Image Segmentation | Sep 16, 2024 | BenchmarkingImage Segmentation | CodeCode Available | 1 | 5 |
| Benchmarking structure-based three-dimensional molecular generative models using GenBench3D: ligand conformation quality matters | Jul 5, 2024 | Benchmarkingvalid | CodeCode Available | 1 | 5 |
| Arctique: An artificial histopathological dataset unifying realism and controllability for uncertainty quantification | Nov 11, 2024 | BenchmarkingImage Segmentation | CodeCode Available | 1 | 5 |