| Benchmarking human visual search computational models in natural scenes: models comparison and reference datasets | Dec 10, 2021 | Benchmarking | Code Available | 1 | 5 |
| DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios | Oct 31, 2024 | Benchmarking, LLM-generated Text Detection | Code Available | 1 | 5 |
| IMGTB: A Framework for Machine-Generated Text Detection Benchmarking | Nov 21, 2023 | Benchmarking, Text Detection | Code Available | 1 | 5 |
| Benchmarking Test-Time Adaptation against Distribution Shifts in Image Classification | Jul 6, 2023 | Benchmarking, Domain Adaptation | Code Available | 1 | 5 |
| A Unified Taxonomy and Multimodal Dataset for Events in Invasion Games | Aug 25, 2021 | Benchmarking, Video Classification | Code Available | 1 | 5 |
| Benchmarking the Abilities of Large Language Models for RDF Knowledge Graph Creation and Comprehension: How Well Do LLMs Speak Turtle? | Sep 29, 2023 | Benchmarking, Knowledge Graph Completion | Code Available | 1 | 5 |
| Benchmarking the Spectrum of Agent Capabilities | Sep 14, 2021 | Benchmarking | Code Available | 1 | 5 |
| DFGC 2021: A DeepFake Game Competition | Jun 2, 2021 | Benchmarking, DeepFake Detection | Code Available | 1 | 5 |
| Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning | May 30, 2024 | Autonomous Driving, Benchmarking | Code Available | 1 | 5 |
| ImageNet-E: Benchmarking Neural Network Robustness via Attribute Editing | Mar 30, 2023 | Attribute, Benchmarking | Code Available | 1 | 5 |