| NEXT-EVAL: Next Evaluation of Traditional and LLM Web Data Record Extraction | May 21, 2025 | Benchmarking, Hallucination | —Unverified | 0 |
| Next-generation MRD assays: do we have the tools to evaluate them properly? | Oct 31, 2023 | Benchmarking, Sensitivity | —Unverified | 0 |
| NL2KQL: From Natural Language to Kusto Query | Apr 3, 2024 | Benchmarking, Natural Language Queries | —Unverified | 0 |
| Benchmarking and Building Zero-Shot Hindi Retrieval Model with Hindi-BEIR and NLLB-E5 | Sep 9, 2024 | Benchmarking, Information Retrieval | —Unverified | 0 |
| NLPre: a revised approach towards language-centric benchmarking of Natural Language Preprocessing systems | Mar 7, 2024 | Benchmarking, Dependency Parsing | —Unverified | 0 |
| No Dataset Needed for Downstream Knowledge Benchmarking: Response Dispersion Inversely Correlates with Accuracy on Domain-specific QA | Aug 24, 2024 | Benchmarking, Chatbot | —Unverified | 0 |
| NODDI-SH: a computational efficient NODDI extension for fODF estimation in diffusion MRI | Aug 28, 2017 | Benchmarking, Diffusion MRI | —Unverified | 0 |
| Node Classification Meets Link Prediction on Knowledge Graphs | Jun 14, 2021 | Benchmarking, Classification | —Unverified | 0 |
| Nodule detection and generation on chest X-rays: NODE21 Challenge | Jan 4, 2024 | Benchmarking | —Unverified | 0 |
| NoisyEQA: Benchmarking Embodied Question Answering Against Noisy Queries | Dec 14, 2024 | Benchmarking, Embodied Question Answering | —Unverified | 0 |
| NoisyHate: Mining Online Human-Written Perturbations for Realistic Robustness Benchmarking of Content Moderation Models | Mar 18, 2023 | Adversarial Attack, Benchmarking | —Unverified | 0 |
| Noisy intermediate-scale quantum (NISQ) algorithms | Jan 21, 2021 | Benchmarking, Combinatorial Optimization | —Unverified | 0 |
| InferBench: Understanding Deep Learning Inference Serving with an Automatic Benchmarking System | Nov 4, 2020 | Benchmarking | —Unverified | 0 |
| Non-Contextual Modeling of Sarcasm using a Neural Network Benchmark | Nov 20, 2017 | Benchmarking, Sentiment Analysis | —Unverified | 0 |
| Non-Reference Quality Assessment for Medical Imaging: Application to Synthetic Brain MRIs | Jul 20, 2024 | Benchmarking, Domain Adaptation | —Unverified | 0 |
| Nonstochastic Bandits with Infinitely Many Experts | Feb 9, 2021 | Benchmarking, Meta-Learning | —Unverified | 0 |
| NoTeS-Bank: Benchmarking Neural Transcription and Search for Scientific Notes Understanding | Apr 12, 2025 | Benchmarking, Document AI | —Unverified | 0 |
| Not Every Tree Is a Forest: Benchmarking Forest Types from Satellite Remote Sensing | May 3, 2025 | Benchmarking, Image Segmentation | —Unverified | 0 |
| NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription | Jan 16, 2024 | Automatic Speech Recognition, Benchmarking | —Unverified | 0 |
| NOVA: A Benchmark for Anomaly Localization and Clinical Reasoning in Brain MRI | May 20, 2025 | Anomaly Localization, Benchmarking | —Unverified | 0 |
| NovelGym: A Flexible Ecosystem for Hybrid Planning and Learning Agents Designed for Open Worlds | Jan 7, 2024 | Autonomous Vehicles, Benchmarking | —Unverified | 0 |
| Long Short-Term Memory with Gate and State Level Fusion for Light Field-Based Face Recognition | May 11, 2019 | Benchmarking, Face Recognition | —Unverified | 0 |
| Novel Real-Time EMT-TS Modeling Architecture for Feeder Blackstart Simulations | Nov 19, 2021 | Benchmarking | —Unverified | 0 |
| NovoBench: Benchmarking Deep Learning-based De Novo Peptide Sequencing Methods in Proteomics | Jun 16, 2024 | Benchmarking, de novo peptide sequencing | —Unverified | 0 |
| Now you see me: evaluating performance in long-term visual tracking | Apr 19, 2018 | Benchmarking, Visual Tracking | —Unverified | 0 |
| N-Shot Benchmarking of Whisper on Diverse Arabic Speech Recognition | Jun 5, 2023 | Arabic Speech Recognition, Benchmarking | —Unverified | 0 |
| NTP : A Neural Network Topology Profiler | May 22, 2019 | Benchmarking, Quantization | —Unverified | 0 |
| Numerical Investigation of Sequence Modeling Theory using Controllable Memory Functions | Jun 6, 2025 | Benchmarking, State Space Models | —Unverified | 0 |
| Human Behavioral Benchmarking: Numeric Magnitude Comparison Effects in Large Language Models | May 18, 2023 | Benchmarking | —Unverified | 0 |
| NUMOSIM: A Synthetic Mobility Dataset with Anomaly Detection Benchmarks | Sep 4, 2024 | Anomaly Detection, Benchmarking | —Unverified | 0 |
| NuwaTS: a Foundation Model Mending Every Incomplete Time Series | May 24, 2024 | Benchmarking, Contrastive Learning | —Unverified | 0 |
| Object Detection based on LIDAR Temporal Pulses using Spiking Neural Networks | Oct 29, 2018 | Autonomous Driving, Benchmarking | —Unverified | 0 |
| OctoPath: An OcTree Based Self-Supervised Learning Approach to Local Trajectory Planning for Mobile Robots | Jun 2, 2021 | Benchmarking, Decoder | —Unverified | 0 |
| OCTrack: Benchmarking the Open-Corpus Multi-Object Tracking | Jul 19, 2024 | Benchmarking, Multi-Object Tracking | —Unverified | 0 |
| Official-NV: An LLM-Generated News Video Dataset for Multimodal Fake News Detection | Jul 28, 2024 | Benchmarking, Fake News Detection | —Unverified | 0 |
| Off-policy Evaluation for Payments at Adyen | Jan 15, 2025 | Benchmarking, Decision Making | —Unverified | 0 |
| OIBench: Benchmarking Strong Reasoning Models with Olympiad in Informatics | Jun 12, 2025 | Benchmarking | —Unverified | 0 |
| Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking | Jun 6, 2024 | 6D Pose Estimation using RGB, Benchmarking | —Unverified | 0 |
| Omnibenchmark (alpha) for continuous and open benchmarking in bioinformatics | Sep 25, 2024 | Benchmarking | —Unverified | 0 |
| OmniEvalKit: A Modular, Lightweight Toolbox for Evaluating Large Language Model and its Omni-Extensions | Dec 9, 2024 | Benchmarking, Language Modeling | —Unverified | 0 |
| OmniPose6D: Towards Short-Term Object Pose Tracking in Dynamic Scenes from Monocular RGB | Oct 9, 2024 | Benchmarking, Diversity | —Unverified | 0 |
| On Benchmarking Code LLMs for Android Malware Analysis | Apr 1, 2025 | Benchmarking, Malware Analysis | —Unverified | 0 |
| On Benchmarking Iris Recognition within a Head-mounted Display for AR/VR Application | Oct 20, 2020 | Benchmarking, Iris Recognition | —Unverified | 0 |
| On Continual Model Refinement in Out-of-Distribution Data Streams | May 4, 2022 | Benchmarking, Continual Learning | —Unverified | 0 |
| On-Device Self-Supervised Learning of Low-Latency Monocular Depth from Only Events | Dec 9, 2024 | Benchmarking, Computational Efficiency | —Unverified | 0 |
| On Distribution Grid Optimal Power Flow Development and Integration | Dec 9, 2022 | Benchmarking | —Unverified | 0 |
| ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities | Dec 9, 2024 | All, Benchmarking | —Unverified | 0 |
| One Label, One Billion Faces: Usage and Consistency of Racial Categories in Computer Vision | Feb 3, 2021 | Benchmarking, Fairness | —Unverified | 0 |
| One of these (Few) Things is Not Like the Others | May 22, 2020 | Benchmarking, Few-Shot Learning | —Unverified | 0 |
| One-Shot Federated Learning with Classifier-Free Diffusion Models | Feb 12, 2025 | Benchmarking, Dataset Generation | —Unverified | 0 |