| Data-Driven Denoising of Stationary Accelerometer Signals | Jun 13, 2022 | Benchmarking, Denoising | Code Available | 1 | 5 |
| Data Generating Process to Evaluate Causal Discovery Techniques for Time Series Data | Apr 16, 2021 | Benchmarking, Causal Discovery | Code Available | 1 | 5 |
| Initial recommendations for performing, benchmarking, and reporting single-cell proteomics experiments | Jul 19, 2022 | Benchmarking, Experimental Design | Code Available | 1 | 5 |
| Benchmarking Robustness of Machine Reading Comprehension Models | Apr 29, 2020 | Benchmarking, Machine Reading Comprehension | Code Available | 1 | 5 |
| Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments | May 8, 2025 | Benchmarking, Prompt Engineering | Code Available | 1 | 5 |
| Benchmarking Robustness to Adversarial Image Obfuscations | Jan 30, 2023 | Benchmarking | Code Available | 1 | 5 |
| DCL-Net: Deep Correspondence Learning Network for 6D Pose Estimation | Oct 11, 2022 | 6D Pose Estimation, 6D Pose Estimation using RGB | Code Available | 1 | 5 |
| MuSe-GNN: Learning Unified Gene Representation From Multimodal Biological Graph Data | Sep 29, 2023 | Benchmarking, Contrastive Learning | Code Available | 1 | 5 |
| NAS-Bench-101: Towards Reproducible Neural Architecture Search | Feb 25, 2019 | Benchmarking, Neural Architecture Search | Code Available | 1 | 5 |
| Data Splits and Metrics for Method Benchmarking on Surgical Action Triplet Datasets | Apr 11, 2022 | Action Triplet Recognition, Benchmarking | Code Available | 1 | 5 |
| Arctique: An artificial histopathological dataset unifying realism and controllability for uncertainty quantification | Nov 11, 2024 | Benchmarking, Image Segmentation | Code Available | 1 | 5 |
| Benchmarking saliency methods for chest X-ray interpretation | Oct 10, 2022 | Benchmarking, Decision Making | Code Available | 1 | 5 |
| Beyond neural scaling laws: beating power law scaling via data pruning | Jun 29, 2022 | Benchmarking | Code Available | 1 | 5 |
| Natural language is not enough: Benchmarking multi-modal generative AI for Verilog generation | Jul 11, 2024 | Benchmarking | Code Available | 1 | 5 |
| Benchmarking Vision, Language, & Action Models on Robotic Learning Tasks | Nov 4, 2024 | Action Generation, Benchmarking | Code Available | 1 | 5 |
| A framework for benchmarking class-out-of-distribution detection and its application to ImageNet | Feb 23, 2023 | Benchmarking, Knowledge Distillation | Code Available | 1 | 5 |
| Benchpress: A Scalable and Versatile Workflow for Benchmarking Structure Learning Algorithms | Jul 8, 2021 | Benchmarking | Code Available | 1 | 5 |
| Benchmarking Segmentation Models with Mask-Preserved Attribute Editing | Mar 2, 2024 | Attribute, Benchmarking | Code Available | 1 | 5 |
| A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking | Oct 14, 2022 | Benchmarking, GPU | Code Available | 1 | 5 |
| Benchmarking Self-Supervised Learning on Diverse Pathology Datasets | Dec 9, 2022 | Benchmarking, Classification | Code Available | 1 | 5 |
| NEORL: NeuroEvolution Optimization with Reinforcement Learning | Dec 1, 2021 | Benchmarking, global-optimization | Code Available | 1 | 5 |
| Ineq-Comp: Benchmarking Human-Intuitive Compositional Reasoning in Automated Theorem Proving on Inequalities | May 19, 2025 | Automated Theorem Proving, Benchmarking | Code Available | 1 | 5 |
| Benchmarking tree species classification from proximally-sensed laser scanning data: introducing the FOR-species20K dataset | Aug 12, 2024 | Benchmarking | Code Available | 1 | 5 |
| Neural Multi-Hop Reasoning With Logical Rules on Biomedical Knowledge Graphs | Mar 18, 2021 | Benchmarking, Knowledge Graphs | Code Available | 1 | 5 |
| Attention, Please! Revisiting Attentive Probing for Masked Image Modeling | Jun 11, 2025 | Benchmarking, Computational Efficiency | Code Available | 1 | 5 |
| Benchmarking Transcriptomics Foundation Models for Perturbation Analysis : one PCA still rules them all | Oct 17, 2024 | All, Benchmarking | Code Available | 1 | 5 |
| IMP-MARL: a Suite of Environments for Large-scale Infrastructure Management Planning via MARL | Jun 20, 2023 | Benchmarking, Management | Code Available | 1 | 5 |
| Benchmarking Simulation-Based Inference | Jan 12, 2021 | Benchmarking | Code Available | 1 | 5 |
| DeID-GPT: Zero-shot Medical Text De-Identification by GPT-4 | Mar 20, 2023 | Benchmarking, De-identification | Code Available | 1 | 5 |
| Neuro-Symbolic Inductive Logic Programming with Logical Neural Networks | Dec 6, 2021 | Benchmarking, Inductive logic programming | Code Available | 1 | 5 |
| ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction | Apr 24, 2024 | Attribute, Attribute Value Extraction | Code Available | 1 | 5 |
| A Large-Scale Dataset for Benchmarking Elevator Button Segmentation and Character Recognition | Mar 16, 2021 | Benchmarking, Position | Code Available | 1 | 5 |
| Deep Learning for ECG Analysis: Benchmarks and Insights from PTB-XL | Apr 28, 2020 | All, Benchmarking | Code Available | 1 | 5 |
| Deep learning model solves change point detection for multiple change types | Apr 15, 2022 | Benchmarking, Change Point Detection | Code Available | 1 | 5 |
| Implicit Multi-Spectral Transformer: An Lightweight and Effective Visible to Infrared Image Translation Model | Apr 10, 2024 | Benchmarking, Image-to-Image Translation | Code Available | 1 | 5 |
| nnU-Net Revisited: A Call for Rigorous Validation in 3D Medical Image Segmentation | Apr 15, 2024 | Benchmarking, Image Segmentation | Code Available | 1 | 5 |
| Benchmarking Spectral Graph Neural Networks: A Comprehensive Study on Effectiveness and Efficiency | Jun 14, 2024 | Benchmarking | Code Available | 1 | 5 |
| AudioMarkBench: Benchmarking Robustness of Audio Watermarking | Jun 11, 2024 | Benchmarking, text-to-speech | Code Available | 1 | 5 |
| Improving and Benchmarking Offline Reinforcement Learning Algorithms | Jun 1, 2023 | Attribute, Benchmarking | Code Available | 1 | 5 |
| Benchmarking TinyML Systems: Challenges and Direction | Mar 10, 2020 | Benchmarking, Position | Code Available | 1 | 5 |
| Benchmarking Implicit Neural Representation and Geometric Rendering in Real-Time RGB-D SLAM | Mar 28, 2024 | Benchmarking | Code Available | 1 | 5 |
| IMGTB: A Framework for Machine-Generated Text Detection Benchmarking | Nov 21, 2023 | Benchmarking, Text Detection | Code Available | 1 | 5 |
| Benchmarking the Spectrum of Agent Capabilities | Sep 14, 2021 | Benchmarking | Code Available | 1 | 5 |
| IMUPoser: Full-Body Pose Estimation using IMUs in Phones, Watches, and Earbuds | Apr 25, 2023 | Benchmarking, Pose Estimation | Code Available | 1 | 5 |
| Benchmarking Image Retrieval for Visual Localization | Nov 24, 2020 | Autonomous Driving, Benchmarking | Code Available | 1 | 5 |
| ArabicaQA: A Comprehensive Dataset for Arabic Question Answering | Mar 26, 2024 | Benchmarking, Machine Reading Comprehension | Code Available | 1 | 5 |
| ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object | Mar 27, 2024 | Benchmarking | Code Available | 1 | 5 |
| Depth-Driven Geometric Prompt Learning for Laparoscopic Liver Landmark Detection | Jun 25, 2024 | Benchmarking, Prompt Learning | Code Available | 1 | 5 |
| Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions | Mar 29, 2024 | Action Detection, Benchmarking | Code Available | 1 | 5 |
| Benchmarking human visual search computational models in natural scenes: models comparison and reference datasets | Dec 10, 2021 | Benchmarking | Code Available | 1 | 5 |