| Benchmarking Cognitive Domains for LLMs: Insights from Taiwanese Hakka Culture | Sep 3, 2024 | Benchmarking, RAG | —Unverified | 0 | 0 |
| Benchmarking CNN on 3D Anatomical Brain MRI: Architectures, Data Augmentation and Deep Ensemble Learning | Jun 2, 2021 | Benchmarking, Data Augmentation | —Unverified | 0 | 0 |
| Benchmarking Clinical Decision Support Search | Jan 29, 2018 | Articles, Benchmarking | —Unverified | 0 | 0 |
| No Dataset Needed for Downstream Knowledge Benchmarking: Response Dispersion Inversely Correlates with Accuracy on Domain-specific QA | Aug 24, 2024 | Benchmarking, Chatbot | —Unverified | 0 | 0 |
| NODDI-SH: a computational efficient NODDI extension for fODF estimation in diffusion MRI | Aug 28, 2017 | Benchmarking, Diffusion MRI | —Unverified | 0 | 0 |
| Benchmarking Classical, Deep, and Generative Models for Human Activity Recognition | Jan 14, 2025 | Activity Recognition, Benchmarking | —Unverified | 0 | 0 |
| Node Classification Meets Link Prediction on Knowledge Graphs | Jun 14, 2021 | Benchmarking, Classification | —Unverified | 0 | 0 |
| Nodule detection and generation on chest X-rays: NODE21 Challenge | Jan 4, 2024 | Benchmarking | —Unverified | 0 | 0 |
| Training Transformers with Enforced Lipschitz Constants | Jul 17, 2025 | Benchmarking | —Unverified | 0 | 0 |
| NoisyEQA: Benchmarking Embodied Question Answering Against Noisy Queries | Dec 14, 2024 | Benchmarking, Embodied Question Answering | —Unverified | 0 | 0 |
| NoisyHate: Mining Online Human-Written Perturbations for Realistic Robustness Benchmarking of Content Moderation Models | Mar 18, 2023 | Adversarial Attack, Benchmarking | —Unverified | 0 | 0 |
| Noisy intermediate-scale quantum (NISQ) algorithms | Jan 21, 2021 | Benchmarking, Combinatorial Optimization | —Unverified | 0 | 0 |
| Trajectory Normalized Gradients for Distributed Optimization | Jan 24, 2019 | Benchmarking, Distributed Optimization | —Unverified | 0 | 0 |
| ActPlan-1K: Benchmarking the Procedural Planning Ability of Visual Language Models in Household Activities | Oct 4, 2024 | Benchmarking, counterfactual | —Unverified | 0 | 0 |
| InferBench: Understanding Deep Learning Inference Serving with an Automatic Benchmarking System | Nov 4, 2020 | Benchmarking | —Unverified | 0 | 0 |
| Non-Contextual Modeling of Sarcasm using a Neural Network Benchmark | Nov 20, 2017 | Benchmarking, Sentiment Analysis | —Unverified | 0 | 0 |
| Non-Reference Quality Assessment for Medical Imaging: Application to Synthetic Brain MRIs | Jul 20, 2024 | Benchmarking, Domain Adaptation | —Unverified | 0 | 0 |
| Nonstochastic Bandits with Infinitely Many Experts | Feb 9, 2021 | Benchmarking, Meta-Learning | —Unverified | 0 | 0 |
| TRAM: Benchmarking Temporal Reasoning for Large Language Models | Oct 2, 2023 | Benchmarking, Few-Shot Learning | —Unverified | 0 | 0 |
| NoTeS-Bank: Benchmarking Neural Transcription and Search for Scientific Notes Understanding | Apr 12, 2025 | Benchmarking, Document AI | —Unverified | 0 | 0 |
| Not Every Tree Is a Forest: Benchmarking Forest Types from Satellite Remote Sensing | May 3, 2025 | Benchmarking, Image Segmentation | —Unverified | 0 | 0 |
| NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription | Jan 16, 2024 | Automatic Speech Recognition, Benchmarking | —Unverified | 0 | 0 |
| NOVA: A Benchmark for Anomaly Localization and Clinical Reasoning in Brain MRI | May 20, 2025 | Anomaly Localization, Benchmarking | —Unverified | 0 | 0 |
| NovelGym: A Flexible Ecosystem for Hybrid Planning and Learning Agents Designed for Open Worlds | Jan 7, 2024 | Autonomous Vehicles, Benchmarking | —Unverified | 0 | 0 |
| Long Short-Term Memory with Gate and State Level Fusion for Light Field-Based Face Recognition | May 11, 2019 | Benchmarking, Face Recognition | —Unverified | 0 | 0 |
| Benchmarking Chinese Medical LLMs: A Medbench-based Analysis of Performance Gaps and Hierarchical Optimization Strategies | Mar 10, 2025 | Benchmarking, Ethics | —Unverified | 0 | 0 |
| Novel Real-Time EMT-TS Modeling Architecture for Feeder Blackstart Simulations | Nov 19, 2021 | Benchmarking | —Unverified | 0 | 0 |
| NovoBench: Benchmarking Deep Learning-based De Novo Peptide Sequencing Methods in Proteomics | Jun 16, 2024 | Benchmarking, de novo peptide sequencing | —Unverified | 0 | 0 |
| Now you see me: evaluating performance in long-term visual tracking | Apr 19, 2018 | Benchmarking, Visual Tracking | —Unverified | 0 | 0 |
| CKnowEdit: A New Chinese Knowledge Editing Dataset for Linguistics, Facts, and Logic Error Correction in LLMs | Sep 9, 2024 | Benchmarking, knowledge editing | —Unverified | 0 | 0 |
| N-Shot Benchmarking of Whisper on Diverse Arabic Speech Recognition | Jun 5, 2023 | Arabic Speech Recognition, Benchmarking | —Unverified | 0 | 0 |
| Transactive Local Energy Markets Enable Community-Level Resource Coordination Using Individual Rewards | Mar 22, 2024 | Benchmarking, energy management | —Unverified | 0 | 0 |
| Benchmarking Chest X-ray Diagnosis Models Across Multinational Datasets | May 21, 2025 | Benchmarking, Diagnostic | —Unverified | 0 | 0 |
| NTP : A Neural Network Topology Profiler | May 22, 2019 | Benchmarking, Quantization | —Unverified | 0 | 0 |
| Benchmarking changepoint detection algorithms on cardiac time series | Apr 16, 2024 | Benchmarking, Change Point Detection | —Unverified | 0 | 0 |
| Numerical Investigation of Sequence Modeling Theory using Controllable Memory Functions | Jun 6, 2025 | Benchmarking, State Space Models | —Unverified | 0 | 0 |
| Human Behavioral Benchmarking: Numeric Magnitude Comparison Effects in Large Language Models | May 18, 2023 | Benchmarking | —Unverified | 0 | 0 |
| NUMOSIM: A Synthetic Mobility Dataset with Anomaly Detection Benchmarks | Sep 4, 2024 | Anomaly Detection, Benchmarking | —Unverified | 0 | 0 |
| NuwaTS: a Foundation Model Mending Every Incomplete Time Series | May 24, 2024 | Benchmarking, Contrastive Learning | —Unverified | 0 | 0 |
| Benchmarking CFAR and CNN-based Peak Detection Algorithms in ISAC under Hardware Impairments | May 16, 2025 | Benchmarking, Integrated sensing and communication | —Unverified | 0 | 0 |
| Benchmarking Causal Study to Interpret Large Language Models for Source Code | Aug 23, 2023 | Benchmarking, Causal Inference | —Unverified | 0 | 0 |
| Object Detection based on LIDAR Temporal Pulses using Spiking Neural Networks | Oct 29, 2018 | Autonomous Driving, Benchmarking | —Unverified | 0 | 0 |
| Benchmarking Burst Super-Resolution for Polarization Images: Noise Dataset and Analysis | Mar 24, 2025 | Benchmarking, Image Reconstruction | —Unverified | 0 | 0 |
| Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment | Aug 6, 2019 | Atari Games, Benchmarking | —Unverified | 0 | 0 |
| Benchmarking BioRelEx for Entity Tagging and Relation Extraction | May 31, 2020 | Benchmarking, Relation | —Unverified | 0 | 0 |
| Benchmarking Biopharmaceuticals Retrieval-Augmented Generation Evaluation | Apr 15, 2025 | Benchmarking, Question Answering | —Unverified | 0 | 0 |
| OctoPath: An OcTree Based Self-Supervised Learning Approach to Local Trajectory Planning for Mobile Robots | Jun 2, 2021 | Benchmarking, Decoder | —Unverified | 0 | 0 |
| Benchmarking Biomedical Nested NER and Relation Extraction Models | Oct 16, 2021 | Benchmarking, NER | —Unverified | 0 | 0 |
| OCTrack: Benchmarking the Open-Corpus Multi-Object Tracking | Jul 19, 2024 | Benchmarking, Multi-Object Tracking | —Unverified | 0 | 0 |
| Benchmarking Bias in Large Language Models during Role-Playing | Nov 1, 2024 | Benchmarking, Fairness | —Unverified | 0 | 0 |