SOTAVerified

Benchmarking

Papers

Showing 3376-3400 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Coherent Feed Forward Quantum Neural Network |  | 0 |
| MRAnnotator: multi-Anatomy and many-Sequence MRI segmentation of 44 structures |  | 0 |
| Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data | Code | 0 |
| Benchmarking Sensitivity of Continual Graph Learning for Skeleton-Based Action Recognition |  | 0 |
| ToPro: Token-Level Prompt Decomposition for Cross-Lingual Sequence Labeling Tasks | Code | 0 |
| Muffin or Chihuahua? Challenging Multimodal Large Language Models with Multipanel VQA |  | 0 |
| PPM: Automated Generation of Diverse Programming Problems for Benchmarking Code Generation Models | Code | 0 |
| Benchmarking with MIMIC-IV, an irregular, spare clinical time series dataset |  | 0 |
| SAM-based instance segmentation models for the automation of structural damage detection |  | 0 |
| Biological Valuation Map of Flanders: A Sentinel-2 Imagery Analysis |  | 0 |
| Benchmarking Large Language Models in Complex Question Answering Attribution using Knowledge Graphs |  | 0 |
| Automated legal reasoning with discretion to act using s(LAW) |  | 0 |
| TriSAM: Tri-Plane SAM for zero-shot cortical blood vessel segmentation in VEM images |  | 0 |
| Large Malaysian Language Model Based on Mistral for Enhanced Local Language Understanding |  | 0 |
| Benchmarking the Fairness of Image Upsampling Methods | Code | 0 |
| LLpowershap: Logistic Loss-based Automated Shapley Values Feature Selection Method | Code | 0 |
| Deep Neural Network Benchmarks for Selective Classification | Code | 0 |
| What the Weight?! A Unified Framework for Zero-Shot Knowledge Composition | Code | 0 |
| Subgroup analysis methods for time-to-event outcomes in heterogeneous randomized controlled trials | Code | 0 |
| Data-Driven Target Localization: Benchmarking Gradient Descent Using the Cramer-Rao Bound |  | 0 |
| Data Augmentation for Traffic Classification |  | 0 |
| Harnessing Orthogonality to Train Low-Rank Neural Networks | Code | 0 |
| NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription |  | 0 |
| OpenDPD: An Open-Source End-to-End Learning & Benchmarking Framework for Wideband Power Amplifier Modeling and Digital Pre-Distortion |  | 0 |
| Large Language Models are Null-Shot Learners |  | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 |  | Unverified |