| CT-Eval: Benchmarking Chinese Text-to-Table Performance in Large Language Models | May 20, 2024 | Benchmarking, Diversity | —Unverified | 0 | 0 |
| STEER-ME: Assessing the Microeconomic Reasoning of Large Language Models | Feb 18, 2025 | Benchmarking, Large Language Model | —Unverified | 0 | 0 |
| CUB: Benchmarking Context Utilisation Techniques for Language Models | May 22, 2025 | Benchmarking, Fact Checking | —Unverified | 0 | 0 |
| CubeSat-Enabled Free-Space Optics: Joint Data Communication and Fine Beam Tracking | Jun 13, 2024 | Benchmarking | —Unverified | 0 | 0 |
| A Multi-View High-Resolution Foot-Ankle Complex Point Cloud Dataset During Gait for Occlusion-Robust 3D Completion | Jul 15, 2025 | Benchmarking, Point Cloud Completion | —Unverified | 0 | 0 |
| COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts | Apr 14, 2025 | Benchmarking, Object | —Unverified | 0 | 0 |
| CULEMO: Cultural Lenses on Emotion -- Benchmarking LLMs for Cross-Cultural Emotion Understanding | Mar 12, 2025 | Benchmarking, Emotion Recognition | —Unverified | 0 | 0 |
| Stochastic Spiking Neural Networks with First-to-Spike Coding | Apr 26, 2024 | Benchmarking | —Unverified | 0 | 0 |
| Curb Your Carbon Emissions: Benchmarking Carbon Emissions in Machine Translation | Sep 26, 2021 | Benchmarking, Machine Translation | —Unverified | 0 | 0 |
| CURE: Concept Unlearning via Orthogonal Representation Editing in Diffusion Models | May 19, 2025 | Benchmarking, Red Teaming | —Unverified | 0 | 0 |
| VIPPrint: A Large Scale Dataset of Printed and Scanned Images for Synthetic Face Images Detection and Source Linking | Feb 1, 2021 | Benchmarking, Image Manipulation | —Unverified | 0 | 0 |
| Countering Backdoor Attacks in Image Recognition: A Survey and Evaluation of Mitigation Strategies | Nov 17, 2024 | Benchmarking | —Unverified | 0 | 0 |
| Curriculum in Gradient-Based Meta-Reinforcement Learning | Feb 19, 2020 | Benchmarking, Meta-Learning | —Unverified | 0 | 0 |
| Curse of Slicing: Why Sliced Mutual Information is a Deceptive Measure of Statistical Dependence | Jun 4, 2025 | Benchmarking | —Unverified | 0 | 0 |
| A Multi-Task Deep Learning Approach for Sensor-based Human Activity Recognition and Segmentation | Mar 20, 2023 | Activity Recognition, Benchmarking | —Unverified | 0 | 0 |
| CoSy: Evaluating Textual Explanations of Neurons | May 30, 2024 | Benchmarking | —Unverified | 0 | 0 |
| Stratify: Unifying Multi-Step Forecasting Strategies | Dec 29, 2024 | Benchmarking | —Unverified | 0 | 0 |
| A Multisensory Learning Architecture for Rotation-invariant Object Recognition | Sep 14, 2020 | Benchmarking, Object | —Unverified | 0 | 0 |
| A Multi-rater Comparative Study of Automatic Target Localization Methods for Epilepsy Deep Brain Stimulation Procedures | Jan 26, 2022 | Benchmarking | —Unverified | 0 | 0 |
| Large Language Model-Based Benchmarking Experiment Settings for Evolutionary Multi-Objective Optimization | Feb 28, 2025 | Benchmarking, Language Modeling | —Unverified | 0 | 0 |
| CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset | Oct 1, 2024 | Benchmarking, Contrastive Learning | —Unverified | 0 | 0 |
| COSET: A Benchmark for Evaluating Neural Program Embeddings | May 27, 2019 | Benchmarking, Graph Neural Network | —Unverified | 0 | 0 |
| CzechLynx: A Dataset for Individual Identification and Pose Estimation of the Eurasian Lynx | Jun 5, 2025 | 2D Pose Estimation, Benchmarking | —Unverified | 0 | 0 |
| Cornac: A Comparative Framework for Multimodal Recommender Systems | May 8, 2020 | Benchmarking, Recommendation Systems | —Unverified | 0 | 0 |
| CORE: Benchmarking LLMs Code Reasoning Capabilities through Static Analysis Tasks | Jul 3, 2025 | Benchmarking, Code Generation | —Unverified | 0 | 0 |
| DACOS-A Manually Annotated Dataset of Code Smells | Mar 15, 2023 | Benchmarking | —Unverified | 0 | 0 |
| DACSA: A large-scale Dataset for Automatic summarization of Catalan and Spanish newspaper Articles | Jul 1, 2022 | Abstractive Text Summarization, Articles | —Unverified | 0 | 0 |
| DailyQA: A Benchmark to Evaluate Web Retrieval Augmented LLMs Based on Capturing Real-World Changes | May 22, 2025 | Benchmarking, RAG | —Unverified | 0 | 0 |
| CORE: A Knowledge Graph Entity Type Prediction Method via Complex Space Regression and Embedding | Dec 19, 2021 | Benchmarking, Prediction | —Unverified | 0 | 0 |
| Danish Airs and Grounds: A Dataset for Aerial-to-Street-Level Place Recognition and Localization | Feb 3, 2022 | 3D Reconstruction, Benchmarking | —Unverified | 0 | 0 |
| DarkBench: Benchmarking Dark Patterns in Large Language Models | Mar 13, 2025 | Benchmarking | —Unverified | 0 | 0 |
| DASB -- Discrete Audio and Speech Benchmark | Jun 20, 2024 | Benchmarking, Emotion Recognition | —Unverified | 0 | 0 |
| Data Analysis in the Era of Generative AI | Sep 27, 2024 | Benchmarking | —Unverified | 0 | 0 |
| Data and its (dis)contents: A survey of dataset development and use in machine learning research | Dec 9, 2020 | Benchmarking, BIG-bench Machine Learning | —Unverified | 0 | 0 |
| Data Augmentation for Continual RL via Adversarial Gradient Episodic Memory | Aug 24, 2024 | Benchmarking, Data Augmentation | —Unverified | 0 | 0 |
| Data Augmentation for Traffic Classification | Jan 19, 2024 | Benchmarking, Classification | —Unverified | 0 | 0 |
| Data Collection of Real-Life Knowledge Work in Context: The RLKWiC Dataset | Apr 16, 2024 | Benchmarking, Management | —Unverified | 0 | 0 |
| Data-driven Approach for Static Hedging of Exchange Traded Options | Feb 1, 2023 | Benchmarking, Interpretable Machine Learning | —Unverified | 0 | 0 |
| COPA: Comparing the Incomparable to Explore the Pareto Front | Mar 18, 2025 | AutoML, Benchmarking | —Unverified | 0 | 0 |
| Data-driven inventory management for new products: An adjusted Dyna-Q approach with transfer learning | Jan 14, 2025 | Benchmarking, Management | —Unverified | 0 | 0 |
| Data-driven Power Flow Linearization: Simulation | Jun 10, 2024 | Benchmarking, Computational Efficiency | —Unverified | 0 | 0 |
| Data-driven surrogate modelling and benchmarking for process equipment | Mar 13, 2020 | Active Learning, Benchmarking | —Unverified | 0 | 0 |
| Data-Driven Target Localization: Benchmarking Gradient Descent Using the Cramer-Rao Bound | Jan 20, 2024 | Benchmarking | —Unverified | 0 | 0 |
| A Multimodal, Full-Surround Vehicular Testbed for Naturalistic Studies and Benchmarking: Design, Calibration and Deployment | Sep 21, 2017 | Autonomous Driving, Benchmarking | —Unverified | 0 | 0 |
| Convolutional and Deep Learning based techniques for Time Series Ordinal Classification | Jun 16, 2023 | Benchmarking, Ordinal Classification | —Unverified | 0 | 0 |
| Data needs and challenges for quantum dot devices automation | Dec 21, 2023 | Benchmarking | —Unverified | 0 | 0 |
| ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments | Feb 27, 2025 | Benchmarking, Code Generation | —Unverified | 0 | 0 |
| Multi-scale data reconstruction of turbulent rotating flows with Gappy POD, Extended POD and Generative Adversarial Networks | Oct 21, 2022 | Benchmarking, Generative Adversarial Network | —Unverified | 0 | 0 |
| Dataset and Benchmarking of Real-Time Embedded Object Detection for RoboCup SSL | Jun 28, 2021 | Benchmarking, Object | —Unverified | 0 | 0 |
| ConvBench: A Comprehensive Benchmark for 2D Convolution Primitive Evaluation | Jul 15, 2024 | Benchmarking | —Unverified | 0 | 0 |