| dMelodies: A Music Dataset for Disentanglement Learning | Jul 29, 2020 | Benchmarking, Disentanglement | Code Available | 1 | 5 |
| Benchmarking the Spectrum of Agent Capabilities | Sep 14, 2021 | Benchmarking | Code Available | 1 | 5 |
| Foundation Model of Electronic Medical Records for Adaptive Risk Estimation | Feb 10, 2025 | Benchmarking | Code Available | 1 | 5 |
| Benchmarking TinyML Systems: Challenges and Direction | Mar 10, 2020 | Benchmarking, Position | Code Available | 1 | 5 |
| Benchmarking Transcriptomics Foundation Models for Perturbation Analysis : one PCA still rules them all | Oct 17, 2024 | All, Benchmarking | Code Available | 1 | 5 |
| fseval: A Benchmarking Framework for Feature Selection and Feature Ranking Algorithms | Nov 23, 2022 | Automated Feature Engineering, Benchmarking | Code Available | 1 | 5 |
| Benchmarking tree species classification from proximally-sensed laser scanning data: introducing the FOR-species20K dataset | Aug 12, 2024 | Benchmarking | Code Available | 1 | 5 |
| FullFront: Benchmarking MLLMs Across the Full Front-End Engineering Workflow | May 23, 2025 | Benchmarking, Code Generation | Code Available | 1 | 5 |
| Don’t be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System | Nov 1, 2021 | Benchmarking, Response Generation | Code Available | 1 | 5 |
| Formalizing Multimedia Recommendation through Multimodal Deep Learning | Sep 11, 2023 | Benchmarking, Deep Learning | Code Available | 1 | 5 |