| An Empirical Study of Training State-of-the-Art LiDAR Segmentation Models | May 23, 2024 | Autonomous Driving, Benchmarking | —Unverified | 0 |
| CoDBench: A Critical Evaluation of Data-driven Models for Continuous Dynamical Systems | Oct 2, 2023 | Benchmarking, Computational Efficiency | —Unverified | 0 |
| CodeARC: Benchmarking Reasoning Capabilities of LLM Agents for Inductive Program Synthesis | Mar 29, 2025 | Benchmarking, Large Language Model | —Unverified | 0 |
| CodeAssistBench (CAB): Dataset & Benchmarking for Multi-turn Chat-Based Code Assistance | Jul 14, 2025 | Benchmarking, Code Generation | —Unverified | 0 |
| DiPCo -- Dinner Party Corpus | Sep 30, 2019 | Benchmarking | —Unverified | 0 |
| CodeCrash: Stress Testing LLM Reasoning under Structural and Semantic Perturbations | Apr 19, 2025 | Benchmarking | —Unverified | 0 |
| CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings | Jan 2, 2025 | Benchmarking, Code Generation | —Unverified | 0 |
| Discovering Visual Concept Structure with Sparse and Incomplete Tags | May 30, 2017 | Benchmarking, Clustering | —Unverified | 0 |
| CholecTrack20: A Multi-Perspective Tracking Dataset for Surgical Tools | Jan 1, 2025 | Benchmarking | —Unverified | 0 |
| Benchmarking ASR Systems Based on Post-Editing Effort and Error Analysis | Jul 1, 2021 | Benchmarking | —Unverified | 0 |