| On the relationship between Benchmarking, Standards and Certification in Robotics and AI | Sep 21, 2023 | Benchmarking | —Unverified | 0 |
| On the Reliability and Validity of Detecting Approval of Political Actors in Tweets | Nov 1, 2020 | Benchmarking, Sentiment Analysis | —Unverified | 0 |
| On the Robustness of Human-Object Interaction Detection against Distribution Shift | Jun 22, 2025 | Benchmarking, Data Augmentation | —Unverified | 0 |
| On the role of benchmarking data sets and simulations in method comparison studies | Aug 2, 2022 | Benchmarking | —Unverified | 0 |
| Optimizer Benchmarking Needs to Account for Hyperparameter Tuning | Oct 25, 2019 | Benchmarking | —Unverified | 0 |
| On the Use of Quality Diversity Algorithms for The Traveling Thief Problem | Dec 16, 2021 | Benchmarking, Diversity | —Unverified | 0 |
| On the Utility of Equivariance and Symmetry Breaking in Deep Learning Architectures on Point Clouds | Jan 1, 2025 | Benchmarking | —Unverified | 0 |
| On the Value of ML Models | Dec 13, 2021 | Benchmarking | —Unverified | 0 |
| OOD-CV-v2: An extended Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images | Apr 17, 2023 | 3D Pose Estimation, Benchmarking | —Unverified | 0 |
| OODFace: Benchmarking Robustness of Face Recognition under Common Corruptions and Appearance Variations | Dec 3, 2024 | Benchmarking, Face Recognition | —Unverified | 0 |
| OOD-Speech: A Large Bengali Speech Recognition Dataset for Out-of-Distribution Benchmarking | May 15, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Open-CD: A Comprehensive Toolbox for Change Detection | Jul 22, 2024 | Benchmarking, Change Detection | —Unverified | 0 |
| OpenContrails: Benchmarking Contrail Detection on GOES-16 ABI | Apr 4, 2023 | Benchmarking | —Unverified | 0 |
| Open Datasets for Satellite Radio Resource Control | Apr 22, 2024 | Benchmarking, Decision Making | —Unverified | 0 |
| OpenDeception: Benchmarking and Investigating AI Deceptive Behaviors via Open-ended Interaction Simulation | Apr 18, 2025 | Benchmarking | —Unverified | 0 |
| OpenDPD: An Open-Source End-to-End Learning & Benchmarking Framework for Wideband Power Amplifier Modeling and Digital Pre-Distortion | Jan 16, 2024 | Benchmarking | —Unverified | 0 |
| OpenEval: Benchmarking Chinese LLMs across Capability, Alignment and Safety | Mar 18, 2024 | Benchmarking, Mathematical Reasoning | —Unverified | 0 |
| OpenFly: A Comprehensive Platform for Aerial Vision-Language Navigation | Feb 25, 2025 | Benchmarking, Semantic Segmentation | —Unverified | 0 |
| Open foundation models for Azerbaijani language | Jul 2, 2024 | Benchmarking | —Unverified | 0 |
| Open Ko-LLM Leaderboard2: Bridging Foundational and Practical Evaluation for Korean LLMs | Oct 16, 2024 | Benchmarking | —Unverified | 0 |
| Open Llama2 Model for the Lithuanian Language | Aug 23, 2024 | Benchmarking, model | —Unverified | 0 |
| OpenMixup: Open Mixup Toolbox and Benchmark for Visual Representation Learning | Sep 11, 2022 | Benchmarking, Classification | —Unverified | 0 |
| Open-set object detection: towards unified problem formulation and benchmarking | Nov 8, 2024 | Autonomous Driving, Benchmarking | —Unverified | 0 |
| OpenSiteRec: An Open Dataset for Site Recommendation | Jul 3, 2023 | Benchmarking, Information Retrieval | —Unverified | 0 |
| Open-Source Manually Annotated Vocal Tract Database for Automatic Segmentation from 3D MRI Using Deep Learning: Benchmarking 2D and 3D Convolutional and Transformer Networks | Jan 8, 2025 | Benchmarking, Deep Learning | —Unverified | 0 |