| A framework for benchmarking class-out-of-distribution detection and its application to ImageNet | Feb 23, 2023 | Benchmarking, Knowledge Distillation | Code Available | 1 | 5 |
| CryptOpt: Verified Compilation with Randomized Program Search for Cryptographic Primitives (full version) | Nov 19, 2022 | Benchmarking, C++ code | Code Available | 1 | 5 |
| Mukayese: Turkish NLP Strikes Back | Mar 2, 2022 | Benchmarking, Language Modeling | Code Available | 1 | 5 |
| Benchmarking Robustness of Machine Reading Comprehension Models | Apr 29, 2020 | Benchmarking, Machine Reading Comprehension | Code Available | 1 | 5 |
| Benchmarking Robustness of Text-Image Composed Retrieval | Nov 24, 2023 | Attribute, Benchmarking | Code Available | 1 | 5 |
| Benchmarking Robustness to Adversarial Image Obfuscations | Jan 30, 2023 | Benchmarking | Code Available | 1 | 5 |
| Benchmarking the Spectrum of Agent Capabilities | Sep 14, 2021 | Benchmarking | Code Available | 1 | 5 |
| Benchmarking tree species classification from proximally-sensed laser scanning data: introducing the FOR-species20K dataset | Aug 12, 2024 | Benchmarking | Code Available | 1 | 5 |
| Illuminating Darkness: Enhancing Real-world Low-light Scenes with Smartphone Images | Mar 10, 2025 | 4k, Benchmarking | Code Available | 1 | 5 |
| ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction | Apr 24, 2024 | Attribute, Attribute Value Extraction | Code Available | 1 | 5 |