| Benchmarking Large Multimodal Models against Common Corruptions | Jan 22, 2024 | Benchmarking, Image to text | Code Available | 1 | 5 |
| Benchmarking Adversarial Patch Against Aerial Detection | Oct 30, 2022 | Benchmarking | Code Available | 1 | 5 |
| dMelodies: A Music Dataset for Disentanglement Learning | Jul 29, 2020 | Benchmarking, Disentanglement | Code Available | 1 | 5 |
| GeoBenchX: Benchmarking LLMs for Multistep Geospatial Tasks | Mar 23, 2025 | Benchmarking, Hallucination | Code Available | 1 | 5 |
| Beyond Correctness: Benchmarking Multi-dimensional Code Generation for Large Language Models | Jul 16, 2024 | Benchmarking, Code Generation | Code Available | 1 | 5 |
| Benchmarking Adversarial Robustness on Image Classification | Jun 1, 2020 | Adversarial Attack, Adversarial Robustness | Code Available | 1 | 5 |
| Benchmarking of DL Libraries and Models on Mobile Devices | Feb 14, 2022 | Benchmarking, GPU | Code Available | 1 | 5 |
| GLGENN: A Novel Parameter-Light Equivariant Neural Networks Architecture Based on Clifford Geometric Algebras | Jun 11, 2025 | Benchmarking | Code Available | 1 | 5 |
| DNN+NeuroSim V2.0: An End-to-End Benchmarking Framework for Compute-in-Memory Accelerators for On-chip Training | Mar 13, 2020 | Benchmarking, Quantization | Code Available | 1 | 5 |
| Does your model understand genes? A benchmark of gene properties for biological and text models | Dec 5, 2024 | Benchmarking, Multi-class Classification | Code Available | 1 | 5 |