| Towards Reliable Detection of LLM-Generated Texts: A Comprehensive Evaluation Framework with CUDRT | Jun 13, 2024 | BenchmarkingLLM-generated Text Detection | CodeCode Available | 1 | 5 |
| DataRec: A Python Library for Standardized and Reproducible Data Management in Recommender Systems | Oct 30, 2024 | BenchmarkingManagement | CodeCode Available | 1 | 5 |
| Cross-Modal Bidirectional Interaction Model for Referring Remote Sensing Image Segmentation | Oct 11, 2024 | BenchmarkingImage Segmentation | CodeCode Available | 1 | 5 |
| Analog or Digital In-memory Computing? Benchmarking through Quantitative Modeling | May 23, 2024 | Benchmarking | CodeCode Available | 1 | 5 |
| CRoW: Benchmarking Commonsense Reasoning in Real-World Tasks | Oct 23, 2023 | Benchmarking | CodeCode Available | 1 | 5 |
| CriticBench: Benchmarking LLMs for Critique-Correct Reasoning | Feb 22, 2024 | Benchmarking | CodeCode Available | 1 | 5 |
| CryptOpt: Verified Compilation with Randomized Program Search for Cryptographic Primitives (full version) | Nov 19, 2022 | BenchmarkingC++ code | CodeCode Available | 1 | 5 |
| Benchmarking Graph Neural Networks on Dynamic Link Prediction | Sep 29, 2021 | BenchmarkingDynamic Link Prediction | CodeCode Available | 1 | 5 |
| Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning | Dec 11, 2024 | AttributeBenchmarking | CodeCode Available | 1 | 5 |
| Anabranch Network for Camouflaged Object Segmentation | May 20, 2021 | BenchmarkingCamouflaged Object Segmentation | CodeCode Available | 1 | 5 |
| Benchmarking Graph Neural Networks for FMRI analysis | Nov 16, 2022 | Benchmarking | CodeCode Available | 1 | 5 |
| Benchmarking Language Models for Code Syntax Understanding | Oct 26, 2022 | Benchmarking | CodeCode Available | 1 | 5 |
| CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer | Dec 2, 2021 | BenchmarkingOrdinal Classification | CodeCode Available | 1 | 5 |
| Dataset and Benchmark: Novel Sensors for Autonomous Vehicle Perception | Jan 24, 2024 | Benchmarking | CodeCode Available | 1 | 5 |
| Deluca -- A Differentiable Control Library: Environments, Methods, and Benchmarking | Feb 19, 2021 | BenchmarkingOpenAI Gym | CodeCode Available | 1 | 5 |
| Do Vision & Language Decoders use Images and Text equally? How Self-consistent are their Explanations? | Apr 29, 2024 | Answer GenerationBenchmarking | CodeCode Available | 1 | 5 |
| Benchmarking Generated Poses: How Rational is Structure-based Drug Design with Generative Models? | Aug 14, 2023 | BenchmarkingDrug Design | CodeCode Available | 1 | 5 |
| Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization | Nov 15, 2023 | BenchmarkingInstruction Following | CodeCode Available | 1 | 5 |
| A multi-schematic classifier-independent oversampling approach for imbalanced datasets | Jul 15, 2021 | Benchmarking | CodeCode Available | 1 | 5 |
| CHOICE: Benchmarking the Remote Sensing Capabilities of Large Vision-Language Models | Nov 27, 2024 | BenchmarkingEarth Observation | CodeCode Available | 1 | 5 |
| CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks | Feb 4, 2023 | Adversarial AttackAdversarial Robustness | CodeCode Available | 1 | 5 |
| Benchmarking for Biomedical Natural Language Processing Tasks with a Domain Specific ALBERT | Jul 9, 2021 | BenchmarkingDocument Classification | CodeCode Available | 1 | 5 |
| AdaPool: Exponential Adaptive Pooling for Information-Retaining Downsampling | Nov 1, 2021 | Benchmarkingobject-detection | CodeCode Available | 1 | 5 |
| A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models | Aug 2, 2022 | BenchmarkingSynthetic Data Generation | CodeCode Available | 1 | 5 |
| M4-SAR: A Multi-Resolution, Multi-Polarization, Multi-Scene, Multi-Source Dataset and Benchmark for Optical-SAR Fusion Object Detection | May 16, 2025 | Benchmarkingobject-detection | CodeCode Available | 1 | 5 |