| Zero-shot Benchmarking: A Framework for Flexible and Scalable Automatic Evaluation of Language Models | Apr 1, 2025 | Benchmarking | —Unverified | 0 | 0 |
| Precise Model Benchmarking with Only a Few Observations | Oct 7, 2024 | Benchmarking, model | —Unverified | 0 | 0 |
| AutoWS-Bench-101: Benchmarking Automated Weak Supervision with 100 Labels | Aug 30, 2022 | Benchmarking | —Unverified | 0 | 0 |
| Model Performance-Guided Evaluation Data Selection for Effective Prompt Optimization | May 15, 2025 | Benchmarking, Clustering | —Unverified | 0 | 0 |
| Predicting credit default probabilities using machine learning techniques in the face of unequal class distributions | Jul 30, 2019 | Benchmarking, BIG-bench Machine Learning | —Unverified | 0 | 0 |
| Predicting Football Match Outcomes with eXplainable Machine Learning and the Kelly Index | Nov 28, 2022 | Benchmarking | —Unverified | 0 | 0 |
| Predicting Quantum Potentials by Deep Neural Network and Metropolis Sampling | Jun 6, 2021 | Benchmarking | —Unverified | 0 | 0 |
| Predicting the Performance of a Computing System with Deep Networks | Feb 27, 2023 | Benchmarking | —Unverified | 0 | 0 |
| Predicting the Probability of Collision of a Satellite with Space Debris: A Bayesian Machine Learning Approach | Nov 17, 2023 | Benchmarking, Collision Avoidance | —Unverified | 0 | 0 |
| Prediction Accuracy & Reliability: Classification and Object Localization under Distribution Shift | Sep 5, 2024 | Autonomous Driving, Benchmarking | —Unverified | 0 | 0 |
| Prediction of cancer driver genes and mutations: the potential of integrative computational frameworks | Mar 30, 2023 | Benchmarking | —Unverified | 0 | 0 |
| Prediction of the Influence of Navigation Scan-path on Perceived Quality of Free-Viewpoint Videos | Oct 10, 2018 | Benchmarking, Video Quality Assessment | —Unverified | 0 | 0 |
| Predictive modelling of a novel anti-adhesion therapy to combat bacterial colonisation of burn wounds | Aug 10, 2017 | Benchmarking | —Unverified | 0 | 0 |
| Predictive Models from Quantum Computer Benchmarks | May 15, 2023 | Benchmarking, image-classification | —Unverified | 0 | 0 |
| Auto-tuning TensorFlow Threading Model for CPU Backend | Dec 4, 2018 | Benchmarking, CPU | —Unverified | 0 | 0 |
| Prepare for Trouble and Make it Double. Supervised and Unsupervised Stacking for AnomalyBased Intrusion Detection | Feb 28, 2022 | Benchmarking, Intrusion Detection | —Unverified | 0 | 0 |
| Benchmarking Machine Reading Comprehension: A Psychological Perspective | Apr 4, 2020 | Benchmarking, Machine Reading Comprehension | —Unverified | 0 | 0 |
| UCCIX: Irish-eXcellence Large Language Model | May 13, 2024 | Benchmarking, Language Modeling | —Unverified | 0 | 0 |
| Pretraining boosts out-of-domain robustness for pose estimation | Sep 24, 2019 | Animal Pose Estimation, Benchmarking | —Unverified | 0 | 0 |
| Who Said That? Benchmarking Social Media AI Detection | Oct 12, 2023 | Benchmarking, Misinformation | —Unverified | 0 | 0 |
| Principles and Guidelines for Evaluating Social Robot Navigation Algorithms | Jun 29, 2023 | Benchmarking, Robot Navigation | —Unverified | 0 | 0 |
| PRISM: Complete Online Decentralized Multi-Agent Pathfinding with Rapid Information Sharing using Motion Constraints | May 12, 2025 | Benchmarking | —Unverified | 0 | 0 |
| Prism: Dynamic and Flexible Benchmarking of LLMs Code Generation with Monte Carlo Tree Search | Apr 7, 2025 | Benchmarking, Code Generation | —Unverified | 0 | 0 |
| Autoregressive Stochastic Clock Jitter Compensation in Analog-to-Digital Converters | May 8, 2025 | Benchmarking | —Unverified | 0 | 0 |
| Privacy-Preserving Language Model Inference with Instance Obfuscation | Feb 13, 2024 | Benchmarking, Language Modeling | —Unverified | 0 | 0 |