| Scaling laws in global corporations as a benchmarking approach to assess environmental performance | Jun 7, 2022 | Benchmarking, Open-Ended Question Answering | —Unverified | 0 |
| A Correlation- and Mean-Aware Loss Function and Benchmarking Framework to Improve GAN-based Tabular Data Synthesis | May 27, 2024 | Benchmarking | —Unverified | 0 |
| Full-stack evaluation of Machine Learning inference workloads for RISC-V systems | May 24, 2024 | Benchmarking, Deep Learning | —Unverified | 0 |
| Efficient Pauli channel estimation with logarithmic quantum memory | Sep 25, 2023 | Benchmarking | —Unverified | 0 |
| Generative Psycho-Lexical Approach for Constructing Value Systems in Large Language Models | Feb 4, 2025 | Benchmarking, Decision Making | —Unverified | 0 |
| Benchmarking Transformer-based Language Models for Arabic Sentiment and Sarcasm Detection | Apr 1, 2021 | Benchmarking, Sarcasm Detection | —Unverified | 0 |
| Automatic Microprocessor Performance Bug Detection | Nov 17, 2020 | Benchmarking | —Unverified | 0 |
| From Standalone LLMs to Integrated Intelligence: A Survey of Compound AI Systems | Jun 5, 2025 | Benchmarking, RAG | —Unverified | 0 |
| From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference | Oct 4, 2023 | Benchmarking, GPU | —Unverified | 0 |
| Benchmarking Toxic Molecule Classification using Graph Neural Networks and Few Shot Learning | Nov 22, 2023 | Benchmarking, Drug Discovery | —Unverified | 0 |
| Automatic detection of passable roads after floods in remote sensed and social media data | Jan 10, 2019 | Benchmarking, Transfer Learning | —Unverified | 0 |
| From Protoscience to Epistemic Monoculture: How Benchmarking Set the Stage for the Deep Learning Revolution | Apr 9, 2024 | Benchmarking | —Unverified | 0 |
| A Line-of-Sight Channel Model for the 100-450 Gigahertz Frequency Band | Feb 12, 2020 | Benchmarking | —Unverified | 0 |
| A Continuously Growing Dataset of Sentential Paraphrases | Aug 1, 2017 | Benchmarking, Paraphrase Identification | —Unverified | 0 |
| From Sound Representation to Model Robustness | Jul 27, 2020 | Adversarial Attack, Adversarial Robustness | —Unverified | 0 |
| FSD-10: A Dataset for Competitive Sports Content Analysis | Feb 9, 2020 | Action Recognition, Benchmarking | —Unverified | 0 |
| Benchmarking Time Series Forecasting Models: From Statistical Techniques to Foundation Models in Real-World Applications | Feb 5, 2025 | Benchmarking, Feature Engineering | —Unverified | 0 |
| Model Performance-Guided Evaluation Data Selection for Effective Prompt Optimization | May 15, 2025 | Benchmarking, Clustering | —Unverified | 0 |
| Benchmarking the Text-to-SQL Capability of Large Language Models: A Comprehensive Evaluation | Mar 5, 2024 | Benchmarking, In-Context Learning | —Unverified | 0 |
| Automated Structured Radiology Report Generation | May 30, 2025 | Benchmarking | —Unverified | 0 |
| From Precision to Perception: User-Centred Evaluation of Keyword Extraction Algorithms for Internet-Scale Contextual Advertising | Apr 30, 2025 | Benchmarking, Computational Efficiency | —Unverified | 0 |
| Benchmarking the Spatial Robustness of DNNs via Natural and Adversarial Localized Corruptions | Apr 2, 2025 | Benchmarking, Segmentation | —Unverified | 0 |
| Benchmarking the Sim-to-Real Gap in Cloth Manipulation | Oct 14, 2023 | Benchmarking, MuJoCo | —Unverified | 0 |
| Automated Machine Learning on Big Data using Stochastic Algorithm Tuning | Jul 30, 2014 | Bayesian Optimisation, Benchmarking | —Unverified | 0 |
| From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future | Aug 5, 2024 | Benchmarking, Code Generation | —Unverified | 0 |