| Fast Benchmarking of Asynchronous Multi-Fidelity Optimization on Zero-Cost Benchmarks | Mar 4, 2024 | Benchmarking | Code Available | 0 |
| Benchmarking Post-Hoc Interpretability Approaches for Transformer-based Misogyny Detection | May 1, 2022 | Benchmarking, Hate Speech Detection | Code Available | 0 |
| Fast Benchmarking of Accuracy vs. Training Time with Cyclic Learning Rates | Jun 2, 2022 | Benchmarking | Code Available | 0 |
| Benchmarking Positional Encodings for GNNs and Graph Transformers | Nov 19, 2024 | Benchmarking | Code Available | 0 |
| Fast and accurate alignment of long bisulfite-seq reads | Jan 6, 2014 | Benchmarking | Code Available | 0 |
| Benchmarking Popular Classification Models' Robustness to Random and Targeted Corruptions | Jan 31, 2020 | Benchmarking, Classification | Code Available | 0 |
| False Promises in Medical Imaging AI? Assessing Validity of Outperformance Claims | May 7, 2025 | Benchmarking | Code Available | 0 |
| Benchmarking Perturbation-based Saliency Maps for Explaining Atari Agents | Jan 18, 2021 | Atari Games, Benchmarking | Code Available | 0 |
| Unsupervised Anomaly Detection in Multivariate Time Series across Heterogeneous Domains | Mar 29, 2025 | Anomaly Detection, Benchmarking | Code Available | 0 |
| Benchmarking person re-identification datasets and approaches for practical real-world implementations | Dec 20, 2022 | Benchmarking, Pedestrian Detection | Code Available | 0 |
| FALCON: Feature-Label Constrained Graph Net Collapse for Memory Efficient GNNs | Dec 27, 2023 | Benchmarking, GPU | Code Available | 0 |
| FairX: A comprehensive benchmarking tool for model analysis using fairness, utility, and explainability | Jun 20, 2024 | Benchmarking, Fairness | Code Available | 0 |
| Authentic Emotion Mapping: Benchmarking Facial Expressions in Real News | Apr 21, 2024 | Benchmarking, Emotion Recognition | Code Available | 0 |
| Benchmarking performance of object detection under image distortions in an uncontrolled environment | Oct 28, 2022 | Benchmarking, Object | Code Available | 0 |
| GUNNEL: Guided Mixup Augmentation and Multi-View Fusion for Aquatic Animal Segmentation | Dec 12, 2021 | Benchmarking, Instance Segmentation | Code Available | 0 |
| Multimodal Benchmarking and Recommendation of Text-to-Image Generation Models | May 6, 2025 | Benchmarking, Image Generation | Code Available | 0 |
| Segmenting France Across Four Centuries | May 30, 2025 | Benchmarking, Image-to-Image Translation | Code Available | 0 |
| Audio Explanation Synthesis with Generative Foundation Models | Oct 10, 2024 | Benchmarking, Decision Making | Code Available | 0 |
| Benchmarking Tropical Cyclone Rapid Intensification with Satellite Images and Attention-based Deep Models | Sep 25, 2019 | Benchmarking, Deep Learning | Code Available | 0 |
| FailureSensorIQ: A Multi-Choice QA Dataset for Understanding Sensor Relationships and Failure Modes | Jun 3, 2025 | Benchmarking, Feature Engineering | Code Available | 0 |
| Can LLMs perform structured graph reasoning? | Feb 2, 2024 | Benchmarking, Navigate | Code Available | 0 |
| Attention-based Class-Conditioned Alignment for Multi-Source Domain Adaptation of Object Detectors | Mar 14, 2024 | Benchmarking, Domain Adaptation | Code Available | 0 |
| Exploring Model-based Planning with Policy Networks | Jun 20, 2019 | Benchmarking, model | Code Available | 0 |
| Exploring Context Generalizability in Citywide Crowd Mobility Prediction: An Analytic Framework and Benchmark | Jun 30, 2021 | Benchmarking, Prediction | Code Available | 0 |
| Multimodal Multi-User Surface Recognition with the Kernel Two-Sample Test | Mar 8, 2023 | Benchmarking, Time Series | Code Available | 0 |
| Exploiting Out-of-Domain Parallel Data through Multilingual Transfer Learning for Low-Resource Neural Machine Translation | Jul 6, 2019 | Benchmarking, Domain Adaptation | Code Available | 0 |
| Zero-shot generation of synthetic neurosurgical data with large language models | Feb 13, 2025 | Benchmarking, De-identification | Code Available | 0 |
| Benchmarking Pathology Foundation Models: Adaptation Strategies and Scenarios | Oct 21, 2024 | Benchmarking, Few-Shot Learning | Code Available | 0 |
| Three Revisits to Node-Level Graph Anomaly Detection: Outliers, Message Passing and Hyperbolic Neural Networks | Mar 6, 2024 | Anomaly Detection, Benchmarking | Code Available | 0 |
| Multiple Instance Learning: A Survey of Problem Characteristics and Applications | Dec 11, 2016 | Benchmarking, Document Classification | Code Available | 0 |
| Self-Adjusting Weighted Expected Improvement for Bayesian Optimization | Jun 7, 2023 | Bayesian Optimization, Benchmarking | Code Available | 0 |
| Multiple Light Source Dataset for Colour Research | Aug 16, 2019 | Benchmarking, Image Segmentation | Code Available | 0 |
| Experimental Analysis of Large-scale Learnable Vector Storage Compression | Nov 27, 2023 | Benchmarking | Code Available | 0 |
| Benchmarking Parameter Control Methods in Differential Evolution for Mixed-Integer Black-Box Optimization | Apr 4, 2024 | Benchmarking | Code Available | 0 |
| ThrowBench: Benchmarking LLMs by Predicting Runtime Exceptions | Mar 6, 2025 | Benchmarking, HumanEval | Code Available | 0 |
| Benchmarking Domain Adaptation for Chemical Processes on the Tennessee Eastman Process | Aug 22, 2023 | Benchmarking, Domain Adaptation | Code Available | 0 |
| AttackSeqBench: Benchmarking Large Language Models' Understanding of Sequential Patterns in Cyber Attacks | Mar 5, 2025 | Benchmarking, graph construction | Code Available | 0 |
| Expecting The Unexpected: Towards Broad Out-Of-Distribution Detection | Aug 22, 2023 | Benchmarking, Out-of-Distribution Detection | Code Available | 0 |
| exHarmony: Authorship and Citations for Benchmarking the Reviewer Assignment Problem | Feb 11, 2025 | Benchmarking, Diversity | Code Available | 0 |
| Benchmarking optimality of time series classification methods in distinguishing diffusions | Jan 30, 2023 | Benchmarking, Gaussian Processes | Code Available | 0 |
| ExEBench: Benchmarking Foundation Models on Extreme Earth Events | May 13, 2025 | Benchmarking, Management | Code Available | 0 |
| MULTITAT: Benchmarking Multilingual Table-and-Text Question Answering | Feb 24, 2025 | Benchmarking, Question Answering | Code Available | 0 |
| Evolving Evolutionary Algorithms with Patterns | Oct 10, 2021 | Benchmarking, Evolutionary Algorithms | Code Available | 0 |
| Semantic Hilbert Space for Text Representation Learning | Feb 26, 2019 | Benchmarking, General Classification | Code Available | 0 |
| A Continuous Information Gain Measure to Find the Most Discriminatory Problems for AI Benchmarking | Sep 9, 2018 | Benchmarking, Game Design | Code Available | 0 |
| Timage -- A Robust Time Series Classification Pipeline | Sep 19, 2019 | Benchmarking, Classification | Code Available | 0 |
| AttackNet: Enhancing Biometric Security via Tailored Convolutional Neural Network Architectures for Liveness Detection | Feb 6, 2024 | Benchmarking | Code Available | 0 |
| EvoLearner: Learning Description Logics with Evolutionary Algorithms | Nov 8, 2021 | Benchmarking, Evolutionary Algorithms | Code Available | 0 |
| Evidential Deep Learning for Uncertainty Quantification and Out-of-Distribution Detection in Jet Identification using Deep Neural Networks | Jan 10, 2025 | Anomaly Detection, Benchmarking | Code Available | 0 |
| Integrating Large Language Models and Knowledge Graphs for Extraction and Validation of Textual Test Data | Aug 3, 2024 | Benchmarking, Knowledge Graphs | Code Available | 0 |