| Benchmarking saliency methods for chest X-ray interpretation | Oct 10, 2022 | BenchmarkingDecision Making | CodeCode Available | 1 | 5 |
| Benchmarking Multi-Scene Fire and Smoke Detection | Oct 22, 2024 | Benchmarking | CodeCode Available | 1 | 5 |
| Benchmarking Segmentation Models with Mask-Preserved Attribute Editing | Mar 2, 2024 | AttributeBenchmarking | CodeCode Available | 1 | 5 |
| Benchmarking Self-Supervised Learning on Diverse Pathology Datasets | Dec 9, 2022 | BenchmarkingClassification | CodeCode Available | 1 | 5 |
| GraphArena: Benchmarking Large Language Models on Graph Computational Problems | Jun 29, 2024 | BenchmarkingHallucination | CodeCode Available | 1 | 5 |
| GraphWorld: Fake Graphs Bring Real Insights for GNNs | Feb 28, 2022 | Benchmarking | CodeCode Available | 1 | 5 |
| Benchmarking Natural Language Understanding Services for building Conversational Agents | Mar 13, 2019 | BenchmarkingGeneral Classification | CodeCode Available | 1 | 5 |
| Boosting Neural Image Compression for Machines Using Latent Space Masking | Dec 15, 2021 | BenchmarkingImage Compression | CodeCode Available | 1 | 5 |
| Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions | May 27, 2022 | BenchmarkingFew-Shot Image Classification | CodeCode Available | 1 | 5 |
| GNNs as Predictors of Agentic Workflow Performances | Mar 14, 2025 | BenchmarkingPosition | CodeCode Available | 1 | 5 |