| SnCQA: A hardware-efficient equivariant quantum convolutional circuit architecture | Nov 23, 2022 | Benchmarking, Computational chemistry | —Unverified | 0 |
| A Look at the Evaluation Setup of the M5 Forecasting Competition | Aug 8, 2021 | Benchmarking, Decision Making | —Unverified | 0 |
| Fuzzy Knowledge Distillation from High-Order TSK to Low-Order TSK | Feb 16, 2023 | Benchmarking, Knowledge Distillation | —Unverified | 0 |
| Benchmarking Unsupervised Outlier Detection with Realistic Synthetic Data | Apr 15, 2020 | Benchmarking, Outlier Detection | —Unverified | 0 |
| A Comprehensive Survey on Retrieval Methods in Recommender Systems | Jul 11, 2024 | Benchmarking, Recommendation Systems | —Unverified | 0 |
| ALOJA-ML: A Framework for Automating Characterization and Knowledge Discovery in Hadoop Deployments | Nov 6, 2015 | Anomaly Detection, Benchmarking | —Unverified | 0 |
| Benchmarking unsupervised near-duplicate image detection | Jul 3, 2019 | Benchmarking, Binary Classification | —Unverified | 0 |
| Abasy Atlas v2.2: The most comprehensive and up-to-date inventory of meta-curated, historical, bacterial regulatory networks, their completeness and system-level characterization | May 5, 2020 | Benchmarking | —Unverified | 0 |
| FunBench: Benchmarking Fundus Reading Skills of MLLMs | Mar 2, 2025 | Anatomy, Benchmarking | —Unverified | 0 |
| Benchmarking Unsupervised Anomaly Detection and Localization | May 30, 2022 | Anomaly Detection, Benchmarking | —Unverified | 0 |
| Benchmarking Unified Face Attack Detection via Hierarchical Prompt Tuning | May 19, 2025 | Benchmarking | —Unverified | 0 |
| Automating Code Adaptation for MLOps -- A Benchmarking Study on LLMs | May 10, 2024 | Benchmarking, Hyperparameter Optimization | —Unverified | 0 |
| Benchmarking Uncertainty Quantification on Biosignal Classification Tasks under Dataset Shift | Dec 16, 2021 | Benchmarking, Classification | —Unverified | 0 |
| Automatic vehicle trajectory data reconstruction at scale | Dec 15, 2022 | Benchmarking, vehicle detection | —Unverified | 0 |
| ALOJA: A Framework for Benchmarking and Predictive Analytics in Big Data Deployments | Nov 6, 2015 | Anomaly Detection, Benchmarking | —Unverified | 0 |
| Functional Code Building Genetic Programming | Jun 9, 2022 | Benchmarking, Program Synthesis | —Unverified | 0 |
| Benchmarking Ultra-Low-Power μNPUs | Mar 28, 2025 | Benchmarking | —Unverified | 0 |
| Automatic Target Recognition on Synthetic Aperture Radar Imagery: A Survey | Jul 4, 2020 | Benchmarking, Survey | —Unverified | 0 |
| Benchmarking Ultra-High-Definition Image Super-Resolution | Jan 1, 2021 | 4k, 8k | —Unverified | 0 |
| Almost Equivariance via Lie Algebra Convolutions | Oct 19, 2023 | Benchmarking | —Unverified | 0 |
| Benchmarking performance, explainability, and evaluation strategies of vision-language models for surgery: Challenges and opportunities | May 16, 2025 | Benchmarking | —Unverified | 0 |
| Benchmarking Twitter Sentiment Analysis Tools | May 1, 2014 | Benchmarking, Decision Making | —Unverified | 0 |
| MultiTrust: A Comprehensive Benchmark Towards Trustworthy Multimodal Large Language Models | Jun 11, 2024 | Benchmarking, Fairness | —Unverified | 0 |
| Automatic segmenting teeth in X-ray images: Trends, a novel data set, benchmarking and future perspectives | Feb 9, 2018 | Benchmarking, Image Segmentation | —Unverified | 0 |
| Benchmarking Transformers-based models on French Spoken Language Understanding tasks | Jul 19, 2022 | Benchmarking, Spoken Language Understanding | —Unverified | 0 |
| Scaling laws in global corporations as a benchmarking approach to assess environmental performance | Jun 7, 2022 | Benchmarking, Open-Ended Question Answering | —Unverified | 0 |
| A Correlation- and Mean-Aware Loss Function and Benchmarking Framework to Improve GAN-based Tabular Data Synthesis | May 27, 2024 | Benchmarking | —Unverified | 0 |
| Full-stack evaluation of Machine Learning inference workloads for RISC-V systems | May 24, 2024 | Benchmarking, Deep Learning | —Unverified | 0 |
| Efficient Pauli channel estimation with logarithmic quantum memory | Sep 25, 2023 | Benchmarking | —Unverified | 0 |
| Generative Psycho-Lexical Approach for Constructing Value Systems in Large Language Models | Feb 4, 2025 | Benchmarking, Decision Making | —Unverified | 0 |
| Benchmarking Transformer-based Language Models for Arabic Sentiment and Sarcasm Detection | Apr 1, 2021 | Benchmarking, Sarcasm Detection | —Unverified | 0 |
| Automatic Microprocessor Performance Bug Detection | Nov 17, 2020 | Benchmarking | —Unverified | 0 |
| From Standalone LLMs to Integrated Intelligence: A Survey of Compound AI Systems | Jun 5, 2025 | Benchmarking, RAG | —Unverified | 0 |
| From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference | Oct 4, 2023 | Benchmarking, GPU | —Unverified | 0 |
| Benchmarking Toxic Molecule Classification using Graph Neural Networks and Few Shot Learning | Nov 22, 2023 | Benchmarking, Drug Discovery | —Unverified | 0 |
| Automatic detection of passable roads after floods in remote sensed and social media data | Jan 10, 2019 | Benchmarking, Transfer Learning | —Unverified | 0 |
| From Protoscience to Epistemic Monoculture: How Benchmarking Set the Stage for the Deep Learning Revolution | Apr 9, 2024 | Benchmarking | —Unverified | 0 |
| A Line-of-Sight Channel Model for the 100-450 Gigahertz Frequency Band | Feb 12, 2020 | Benchmarking | —Unverified | 0 |
| A Continuously Growing Dataset of Sentential Paraphrases | Aug 1, 2017 | Benchmarking, Paraphrase Identification | —Unverified | 0 |
| From Sound Representation to Model Robustness | Jul 27, 2020 | Adversarial Attack, Adversarial Robustness | —Unverified | 0 |
| FSD-10: A Dataset for Competitive Sports Content Analysis | Feb 9, 2020 | Action Recognition, Benchmarking | —Unverified | 0 |
| Benchmarking Time Series Forecasting Models: From Statistical Techniques to Foundation Models in Real-World Applications | Feb 5, 2025 | Benchmarking, Feature Engineering | —Unverified | 0 |
| Model Performance-Guided Evaluation Data Selection for Effective Prompt Optimization | May 15, 2025 | Benchmarking, Clustering | —Unverified | 0 |
| Benchmarking the Text-to-SQL Capability of Large Language Models: A Comprehensive Evaluation | Mar 5, 2024 | Benchmarking, In-Context Learning | —Unverified | 0 |
| Automated Structured Radiology Report Generation | May 30, 2025 | Benchmarking | —Unverified | 0 |
| From Precision to Perception: User-Centred Evaluation of Keyword Extraction Algorithms for Internet-Scale Contextual Advertising | Apr 30, 2025 | Benchmarking, Computational Efficiency | —Unverified | 0 |
| Benchmarking the Spatial Robustness of DNNs via Natural and Adversarial Localized Corruptions | Apr 2, 2025 | Benchmarking, Segmentation | —Unverified | 0 |
| Benchmarking the Sim-to-Real Gap in Cloth Manipulation | Oct 14, 2023 | Benchmarking, MuJoCo | —Unverified | 0 |
| Automated Machine Learning on Big Data using Stochastic Algorithm Tuning | Jul 30, 2014 | Bayesian Optimisation, Benchmarking | —Unverified | 0 |
| From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future | Aug 5, 2024 | Benchmarking, Code Generation | —Unverified | 0 |