| CHOICE: Benchmarking the Remote Sensing Capabilities of Large Vision-Language Models | Nov 27, 2024 | Benchmarking, Earth Observation | Code Available | 1 |
| OpenFWI: Large-Scale Multi-Structural Benchmark Datasets for Seismic Full Waveform Inversion | Nov 4, 2021 | 2k, Benchmarking | Code Available | 1 |
| CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks | Feb 4, 2023 | Adversarial Attack, Adversarial Robustness | Code Available | 1 |
| Amharic LLaMA and LLaVA: Multimodal LLMs for Low Resource Languages | Mar 11, 2024 | Benchmarking, Data Augmentation | Code Available | 1 |
| Benchmarking Graph Neural Networks on Dynamic Link Prediction | Sep 29, 2021 | Benchmarking, Dynamic Link Prediction | Code Available | 1 |
| Benchmarking Graph Neural Networks for FMRI analysis | Nov 16, 2022 | Benchmarking | Code Available | 1 |
| FinDABench: Benchmarking Financial Data Analysis Ability of Large Language Models | Jan 1, 2024 | Benchmarking | Code Available | 1 |
| Beyond Correctness: Benchmarking Multi-dimensional Code Generation for Large Language Models | Jul 16, 2024 | Benchmarking, Code Generation | Code Available | 1 |
| Large Scale MRI Collection and Segmentation of Cirrhotic Liver | Oct 6, 2024 | Benchmarking, Diagnostic | Code Available | 1 |
| CounselBench: A Large-Scale Expert Evaluation and Adversarial Benchmark of Large Language Models in Mental Health Counseling | Jun 10, 2025 | Benchmarking | Code Available | 1 |