SOTAVerified

Benchmarking

Papers

Showing 3501–3525 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| NEXT-EVAL: Next Evaluation of Traditional and LLM Web Data Record Extraction | | 0 |
| Next-generation MRD assays: do we have the tools to evaluate them properly? | | 0 |
| NL2KQL: From Natural Language to Kusto Query | | 0 |
| Benchmarking and Building Zero-Shot Hindi Retrieval Model with Hindi-BEIR and NLLB-E5 | | 0 |
| NLPre: a revised approach towards language-centric benchmarking of Natural Language Preprocessing systems | | 0 |
| No Dataset Needed for Downstream Knowledge Benchmarking: Response Dispersion Inversely Correlates with Accuracy on Domain-specific QA | | 0 |
| NODDI-SH: a computational efficient NODDI extension for fODF estimation in diffusion MRI | | 0 |
| Node Classification Meets Link Prediction on Knowledge Graphs | | 0 |
| Nodule detection and generation on chest X-rays: NODE21 Challenge | | 0 |
| NoisyEQA: Benchmarking Embodied Question Answering Against Noisy Queries | | 0 |
| NoisyHate: Mining Online Human-Written Perturbations for Realistic Robustness Benchmarking of Content Moderation Models | | 0 |
| Noisy intermediate-scale quantum (NISQ) algorithms | | 0 |
| InferBench: Understanding Deep Learning Inference Serving with an Automatic Benchmarking System | | 0 |
| Non-Contextual Modeling of Sarcasm using a Neural Network Benchmark | | 0 |
| Non-Reference Quality Assessment for Medical Imaging: Application to Synthetic Brain MRIs | | 0 |
| Nonstochastic Bandits with Infinitely Many Experts | | 0 |
| NoTeS-Bank: Benchmarking Neural Transcription and Search for Scientific Notes Understanding | | 0 |
| Not Every Tree Is a Forest: Benchmarking Forest Types from Satellite Remote Sensing | | 0 |
| NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription | | 0 |
| NOVA: A Benchmark for Anomaly Localization and Clinical Reasoning in Brain MRI | | 0 |
| NovelGym: A Flexible Ecosystem for Hybrid Planning and Learning Agents Designed for Open Worlds | | 0 |
| Long Short-Term Memory with Gate and State Level Fusion for Light Field-Based Face Recognition | | 0 |
| Novel Real-Time EMT-TS Modeling Architecture for Feeder Blackstart Simulations | | 0 |
| NovoBench: Benchmarking Deep Learning-based De Novo Peptide Sequencing Methods in Proteomics | | 0 |
| Now you see me: evaluating performance in long-term visual tracking | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |