| Quantifying the Reasoning Abilities of LLMs on Real-world Clinical Cases | Mar 6, 2025 | Benchmarking, Diagnostic | Code Available | 0 |
| A New Cervical Cytology Dataset for Nucleus Detection and Image Classification (Cervix93) and Methods for Cervical Nucleus Detection | Nov 23, 2018 | Benchmarking, Cervical Nucleus Detection | Code Available | 0 |
| ClimRetrieve: A Benchmarking Dataset for Information Retrieval from Corporate Climate Disclosures | Jun 14, 2024 | Answer Generation, Benchmarking | Code Available | 0 |
| Benchmarking and Rethinking Knowledge Editing for Large Language Models | May 24, 2025 | Benchmarking, knowledge editing | Code Available | 0 |
| CLEAVE: Scalable and Edge-native Benchmarking of Networked Control Systems | Apr 5, 2022 | Benchmarking, Edge-computing | Code Available | 0 |
| Quantitative Metrics for Benchmarking Human-Aware Robot Navigation | Jul 26, 2023 | Benchmarking, Robot Navigation | Code Available | 0 |
| Benchmarking and optimizing organism wide single-cell RNA alignment methods | Mar 26, 2025 | Benchmarking, Decoder | Code Available | 0 |
| XTSC-Bench: Quantitative Benchmarking for Explainers on Time Series Classification | Oct 23, 2023 | Benchmarking, Time Series | Code Available | 0 |
| CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models | Mar 6, 2025 | Benchmarking, Continual Learning | Code Available | 0 |
| Benchmarking and Improving Text-to-SQL Generation under Ambiguity | Oct 20, 2023 | Benchmarking, Diversity | Code Available | 0 |
| Quantum Boosting using Domain-Partitioning Hypotheses | Oct 25, 2021 | Benchmarking, Ensemble Learning | Code Available | 0 |
| TCC-Bench: Benchmarking the Traditional Chinese Culture Understanding Capabilities of MLLMs | May 16, 2025 | Benchmarking, Question Answering | Code Available | 0 |
| Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation | Apr 5, 2024 | Attribute, Benchmarking | Code Available | 0 |
| Multi-GPU-Enabled Hybrid Quantum-Classical Workflow in Quantum-HPC Middleware: Applications in Quantum Simulations | Mar 9, 2024 | Benchmarking, CPU | Code Available | 0 |
| TDBench: Benchmarking Vision-Language Models in Understanding Top-Down Images | Apr 1, 2025 | Autonomous Navigation, Benchmarking | Code Available | 0 |
| A new baseline for retinal vessel segmentation: Numerical identification and correction of methodological inconsistencies affecting 100+ papers | Nov 6, 2021 | Benchmarking, Retinal Vessel Segmentation | Code Available | 0 |
| Adversarial Environment Generation for Learning to Navigate the Web | Mar 2, 2021 | Benchmarking, Decision Making | Code Available | 0 |
| A*3D Dataset: Towards Autonomous Driving in Challenging Environments | Sep 17, 2019 | 3D Object Detection, Autonomous Driving | Code Available | 0 |
| TrustSQL: Benchmarking Text-to-SQL Reliability with Penalty-Based Scoring | Mar 23, 2024 | Benchmarking, Text to SQL | Code Available | 0 |
| Class Imbalance in Object Detection: An Experimental Diagnosis and Study of Mitigation Strategies | Mar 11, 2024 | Benchmarking, Data Augmentation | Code Available | 0 |
| Quasi-Newton Methods for Machine Learning: Forget the Past, Just Sample | Jan 28, 2019 | Benchmarking, BIG-bench Machine Learning | Code Available | 0 |
| Quaternion Capsule Networks | Jul 8, 2020 | Benchmarking, Object Recognition | Code Available | 0 |
| QU-BraTS: MICCAI BraTS 2020 Challenge on Quantifying Uncertainty in Brain Tumor Segmentation - Analysis of Ranking Scores and Benchmarking Results | Dec 19, 2021 | Benchmarking, Brain Tumor Segmentation | Code Available | 0 |
| QUENCH: Measuring the gap between Indic and Non-Indic Contextual General Reasoning in LLMs | Dec 16, 2024 | Benchmarking, Common Sense Reasoning | Code Available | 0 |
| Question-Answering Dense Video Events | Sep 6, 2024 | Benchmarking, Question Answering | Code Available | 0 |