| Benchmarking the Robustness of Quantized Models | Apr 8, 2023 | Benchmarking, Quantization | —Unverified | 0 |
| Benchmarking the Robustness of Panoptic Segmentation for Automated Driving | Feb 23, 2024 | Benchmarking, Decision Making | —Unverified | 0 |
| Automated Factual Benchmarking for In-Car Conversational Systems using Large Language Models | Apr 1, 2025 | Benchmarking, Conversational Question Answering | —Unverified | 0 |
| A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery | Apr 5, 2022 | Benchmarking, object-detection | —Unverified | 0 |
| A Baseline Method for Removing Invisible Image Watermarks using Deep Image Prior | Feb 19, 2025 | Benchmarking, Misinformation | —Unverified | 0 |
| Benchmarking the Robustness of Instance Segmentation Models | Sep 2, 2021 | Benchmarking, Domain Adaptation | —Unverified | 0 |
| Automated detection of gibbon calls from passive acoustic monitoring data using convolutional neural networks in the "torch for R" ecosystem | Jul 13, 2024 | Benchmarking, Deep Learning | —Unverified | 0 |
| Generalized Conflict-directed Search for Optimal Ordering Problems | Mar 31, 2021 | Benchmarking, Scheduling | —Unverified | 0 |
| Generalizing Vision-Language Models to Novel Domains: A Comprehensive Survey | Jun 23, 2025 | Benchmarking, Survey | —Unverified | 0 |
| Alibaba’s Submission for the WMT 2020 APE Shared Task: Improving Automatic Post-Editing with Pre-trained Conditional Cross-Lingual BERT | Nov 1, 2020 | Automatic Post-Editing, Benchmarking | —Unverified | 0 |
| Benchmarking the Reliability of Post-training Quantization: a Particular Focus on Worst-case Performance | Mar 23, 2023 | Benchmarking, Data Augmentation | —Unverified | 0 |
| Benchmarking the rationality of AI decision making using the transitivity axiom | Feb 14, 2025 | Benchmarking, Decision Making | —Unverified | 0 |
| Automated 3D Tumor Segmentation using Temporal Cubic PatchGAN (TCuP-GAN) | Nov 23, 2023 | Benchmarking, Brain Tumor Segmentation | —Unverified | 0 |
| Generalization, Mayhems and Limits in Recurrent Proximal Policy Optimization | May 23, 2022 | Benchmarking, Deep Reinforcement Learning | —Unverified | 0 |
| Benchmarking the Physical-world Adversarial Robustness of Vehicle Detection | Apr 11, 2023 | Adversarial Attack, Adversarial Robustness | —Unverified | 0 |
| AutoLay: Benchmarking amodal layout estimation for autonomous driving | Aug 20, 2021 | Amodal Layout Estimation, Autonomous Driving | —Unverified | 0 |
| Benchmarking the Neural Linear Model for Regression | Dec 18, 2019 | Bayesian Optimization, Benchmarking | —Unverified | 0 |
| Algorithm Selection with Probing Trajectories: Benchmarking the Choice of Classifier Model | Jan 20, 2025 | Benchmarking | —Unverified | 0 |
| Generalized Attention Flow: Feature Attribution for Transformer Models via Maximum Flow | Feb 14, 2025 | Benchmarking | —Unverified | 0 |
| General Scales Unlock AI Evaluation with Explanatory and Predictive Power | Mar 9, 2025 | Benchmarking, Specificity | —Unverified | 0 |
| Generative AI for Synthetic Data Across Multiple Medical Modalities: A Systematic Review of Recent Developments and Challenges | Jun 27, 2024 | Benchmarking, Clinical Knowledge | —Unverified | 0 |
| Benchmarking the Impact of Noise on Deep Learning-based Classification of Atrial Fibrillation in 12-Lead ECG | Mar 24, 2023 | Atrial Fibrillation Detection, Benchmarking | —Unverified | 0 |
| Benchmarking the human brain against computational architectures | May 15, 2023 | Benchmarking, Computational Efficiency | —Unverified | 0 |
| A Conformance Checking-based Approach for Drift Detection in Business Processes | Jul 9, 2019 | Benchmarking, Drift Detection | —Unverified | 0 |
| GeneAgent: Self-verification Language Agent for Gene Set Knowledge Discovery using Domain Databases | May 25, 2024 | Benchmarking, Hallucination | —Unverified | 0 |
| AutoAI-TS: AutoAI for Time Series Forecasting | Feb 24, 2021 | Benchmarking, BIG-bench Machine Learning | —Unverified | 0 |
| Benchmarking the Gerchberg-Saxton Algorithm | May 18, 2020 | Benchmarking | —Unverified | 0 |
| ALdataset: a benchmark for pool-based active learning | Oct 16, 2020 | Active Learning, Benchmarking | —Unverified | 0 |
| Benchmarking the Fidelity and Utility of Synthetic Relational Data | Oct 4, 2024 | Benchmarking, Feature Importance | —Unverified | 0 |
| GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing | Jun 30, 2024 | Benchmarking, counterfactual | —Unverified | 0 |
| Generalised Gaussian Process Latent Variable Models (GPLVM) with Stochastic Variational Inference | Feb 25, 2022 | Benchmarking, Dimensionality Reduction | —Unverified | 0 |
| AA3DNet: Attention Augmented Real Time 3D Object Detection | Jul 26, 2021 | 3D Object Detection, Autonomous Vehicles | —Unverified | 0 |
| Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web | May 1, 2014 | Benchmarking, Entity Linking | —Unverified | 0 |
| Benchmarking the Effectiveness of Classification Algorithms and SVM Kernels for Dry Beans | Jul 15, 2023 | Benchmarking, Dimensionality Reduction | —Unverified | 0 |
| A Computer Vision System to Localize and Classify Wastes on the Streets | Oct 31, 2017 | Benchmarking | —Unverified | 0 |
| Practical Guidelines for Cell Segmentation Models Under Optical Aberrations in Microscopy | Apr 12, 2024 | Benchmarking, Cell Segmentation | —Unverified | 0 |
| Benchmarking the Capabilities of Large Language Models in Transportation System Engineering: Accuracy, Consistency, and Reasoning Behaviors | Aug 15, 2024 | Benchmarking, Management | —Unverified | 0 |
| Benchmarking the Benchmark -- Analysis of Synthetic NIDS Datasets | Apr 19, 2021 | Benchmarking, Intrusion Detection | —Unverified | 0 |
| A Universal Protocol to Benchmark Camera Calibration for Sports | Apr 15, 2024 | Benchmarking, Camera Calibration | —Unverified | 0 |
| A Lazy Man's Approach to Benchmarking: Semisupervised Classifier Evaluation and Recalibration | Jun 1, 2013 | Benchmarking | —Unverified | 0 |
| A Unified Taylor Framework for Revisiting Attribution Methods | Aug 21, 2020 | Benchmarking, Decision Making | —Unverified | 0 |
| Benchmarking the Accuracy and Robustness of Feedback Alignment Algorithms | Aug 30, 2021 | Benchmarking | —Unverified | 0 |
| A Latent Fingerprint in the Wild Database | Apr 3, 2023 | Benchmarking | —Unverified | 0 |
| Benchmarking Test-Time Unsupervised Deep Neural Network Adaptation on Edge Devices | Mar 21, 2022 | Benchmarking, GPU | —Unverified | 0 |
| Benchmarking terminology building capabilities of ChatGPT on an English-Russian Fashion Corpus | Dec 4, 2024 | Benchmarking | —Unverified | 0 |
| A Unified Study of Machine Learning Explanation Evaluation Metrics | Mar 27, 2022 | Benchmarking, BIG-bench Machine Learning | —Unverified | 0 |
| Benchmarking Table Comprehension In The Wild | Dec 13, 2024 | Benchmarking, Question Answering | —Unverified | 0 |
| A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking | May 26, 2025 | Benchmarking, Optical Flow Estimation | —Unverified | 0 |
| A Large-scale Study on Training Sample Memorization in Generative Modeling | Jan 1, 2021 | Benchmarking, Memorization | —Unverified | 0 |
| Benchmarking Systematic Relational Reasoning with Large Language and Reasoning Models | Mar 30, 2025 | Benchmarking, Relational Reasoning | —Unverified | 0 |