| The Forchheim Image Database for Camera Identification in the Wild | Nov 4, 2020 | Benchmarking, Fact Checking | —Unverified | 0 | 0 |
| MultiTrust: A Comprehensive Benchmark Towards Trustworthy Multimodal Large Language Models | Jun 11, 2024 | Benchmarking, Fairness | —Unverified | 0 | 0 |
| How Universal are Universal Dependencies? Exploiting Syntax for Multilingual Clause-level Sentiment Detection | May 1, 2020 | Benchmarking, BIG-bench Machine Learning | —Unverified | 0 | 0 |
| Benchmarking Transformers-based models on French Spoken Language Understanding tasks | Jul 19, 2022 | Benchmarking, Spoken Language Understanding | —Unverified | 0 | 0 |
| How well it works: Benchmarking performance of GPT models on medical natural language processing tasks | Jun 12, 2024 | Benchmarking | —Unverified | 0 | 0 |
| You Only Crash Once v2: Perceptually Consistent Strong Features for One-Stage Domain Adaptive Detection of Space Terrain | Jan 23, 2025 | Benchmarking, Domain Adaptation | —Unverified | 0 | 0 |
| The Impact of ASR on the Automatic Analysis of Linguistic Complexity and Sophistication in Spontaneous L2 Speech | Apr 17, 2021 | Benchmarking | —Unverified | 0 | 0 |
| The Impact of Genomic Variation on Function (IGVF) Consortium | Jul 24, 2023 | Benchmarking | —Unverified | 0 | 0 |
| A General Taylor Framework for Unifying and Revisiting Attribution Methods | May 28, 2021 | Benchmarking, Decision Making | —Unverified | 0 | 0 |
| HULK: An Energy Efficiency Benchmark Platform for Responsible Natural Language Processing | Feb 14, 2020 | Benchmarking | —Unverified | 0 | 0 |