| ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities | Dec 9, 2024 | All, Benchmarking | —Unverified | 0 | 0 |
| One Label, One Billion Faces: Usage and Consistency of Racial Categories in Computer Vision | Feb 3, 2021 | Benchmarking, Fairness | —Unverified | 0 | 0 |
| Audio Turing Test: Benchmarking the Human-likeness of Large Language Model-based Text-to-Speech Systems in Chinese | May 16, 2025 | Benchmarking, Language Modeling | —Unverified | 0 | 0 |
| One of these (Few) Things is Not Like the Others | May 22, 2020 | Benchmarking, Few-Shot Learning | —Unverified | 0 | 0 |
| Benchmarking Audio Deepfake Detection Robustness in Real-world Communication Scenarios | Apr 16, 2025 | Audio Deepfake Detection, Benchmarking | —Unverified | 0 | 0 |
| One-Shot Federated Learning with Classifier-Free Diffusion Models | Feb 12, 2025 | Benchmarking, Dataset Generation | —Unverified | 0 | 0 |
| On Evaluation of Bangla Word Analogies | Apr 10, 2023 | Benchmarking, Word Embeddings | —Unverified | 0 | 0 |
| On Evaluation of Document Classification using RVL-CDIP | Jun 21, 2023 | Benchmarking, Classification | —Unverified | 0 | 0 |
| Benchmarking Attention Mechanisms and Consistency Regularization Semi-Supervised Learning for Post-Flood Building Damage Assessment in Satellite Images | Dec 4, 2024 | Benchmarking, Building Damage Assessment | —Unverified | 0 | 0 |
| On General Language Understanding | Oct 27, 2023 | Benchmarking, Ethics | —Unverified | 0 | 0 |
| Benchmarking ASR Systems Based on Post-Editing Effort and Error Analysis | Jul 1, 2021 | Benchmarking | —Unverified | 0 | 0 |
| Online Model-based Anomaly Detection in Multivariate Time Series: Taxonomy, Survey, Research Challenges and Future Directions | Aug 7, 2024 | Anomaly Detection, Benchmarking | —Unverified | 0 | 0 |
| Online vs Offline: A Comparative Study of First-Party and Third-Party Evaluations of Social Chatbots | Sep 12, 2024 | Benchmarking, Chatbot | —Unverified | 0 | 0 |
| On loss functions and evaluation metrics for music source separation | Feb 16, 2022 | Audio Source Separation, Benchmarking | —Unverified | 0 | 0 |
| Only Time Can Tell: Discovering Temporal Data for Temporal Modeling | Jul 19, 2019 | Benchmarking, Motion Estimation | —Unverified | 0 | 0 |
| On Machine Learning Approaches for Protein-Ligand Binding Affinity Prediction | Jul 15, 2024 | Active Learning, Benchmarking | —Unverified | 0 | 0 |
| An Approach to Evaluate Modeling Adequacy for Small-Signal Stability Analysis of IBR-related SSOs in Multimachine Systems | Mar 12, 2024 | Benchmarking | —Unverified | 0 | 0 |
| On Neural Inertial Classification Networks for Pedestrian Activity Recognition | Feb 23, 2025 | Activity Recognition, Benchmarking | —Unverified | 0 | 0 |
| Zero-Forcing Max-Power Beamforming for Hybrid mmWave Full-Duplex MIMO Systems | Feb 29, 2020 | Benchmarking | —Unverified | 0 | 0 |
| LAraBench: Benchmarking Arabic AI with Large Language Models | May 24, 2023 | Benchmarking, Few-Shot Learning | —Unverified | 0 | 0 |
| On quantifying and improving realism of images generated with diffusion | Sep 26, 2023 | Attribute, Benchmarking | —Unverified | 0 | 0 |
| Active Evaluation Acquisition for Efficient LLM Benchmarking | Oct 8, 2024 | Benchmarking | —Unverified | 0 | 0 |
| On Symbiosis of Attribute Prediction and Semantic Segmentation | Nov 23, 2019 | Attribute, Benchmarking | —Unverified | 0 | 0 |
| On the Assessment of Benchmark Suites for Algorithm Comparison | Apr 15, 2021 | Benchmarking | —Unverified | 0 | 0 |
| On the Benchmarking of LLMs for Open-Domain Dialogue Evaluation | Jul 4, 2024 | Benchmarking, Chatbot | —Unverified | 0 | 0 |