| CIBench: Evaluating Your LLMs with a Code Interpreter Plugin | Jul 15, 2024 | Benchmarking | Code Available | 1 | 5 |
| Benchmarking Embedding Aggregation Methods in Computational Pathology: A Clinical Data Perspective | Jul 10, 2024 | Benchmarking, Diagnostic | Code Available | 1 | 5 |
| CIDEr: Consensus-based Image Description Evaluation | Nov 20, 2014 | Action Recognition, Attribute | Code Available | 1 | 5 |
| Benchmarking Detection Transfer Learning with Vision Transformers | Nov 22, 2021 | Benchmarking, object-detection | Code Available | 1 | 5 |
| Benchmarking Deep Models for Salient Object Detection | Feb 7, 2022 | Benchmarking, Object | Code Available | 1 | 5 |
| Collab-Overcooked: Benchmarking and Evaluating Large Language Models as Collaborative Agents | Feb 27, 2025 | Benchmarking | Code Available | 1 | 5 |
| CHILI: Chemically-Informed Large-scale Inorganic Nanomaterials Dataset for Advancing Graph Machine Learning | Feb 20, 2024 | Atomic number classification, Benchmarking | Code Available | 1 | 5 |
| An Open-source Benchmark of Deep Learning Models for Audio-visual Apparent and Self-reported Personality Recognition | Oct 17, 2022 | Benchmarking | Code Available | 1 | 5 |
| CombiBench: Benchmarking LLM Capability for Combinatorial Mathematics | May 6, 2025 | Benchmarking | Code Available | 1 | 5 |
| CIPCaD-Bench: Continuous Industrial Process datasets for benchmarking Causal Discovery methods | Aug 2, 2022 | Benchmarking, Causal Discovery | Code Available | 1 | 5 |