SOTAVerified

Benchmarking

Papers

Showing 3521–3530 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Model Agnostic Explainable Selective Regression via Uncertainty Estimation | | 0 |
| Benchmarking Individual Tree Mapping with Sub-meter Imagery | | 0 |
| On Using Distribution-Based Compositionality Assessment to Evaluate Compositional Generalisation in Machine Translation | Code | 0 |
| The Disagreement Problem in Faithfulness Metrics | | 0 |
| Uncertainty estimation of machine learning spatial precipitation predictions from satellite data | | 0 |
| MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks | | 0 |
| Connecting the Dots: Graph Neural Network Powered Ensemble and Classification of Medical Images | Code | 0 |
| Identification of vortex in unstructured mesh with graph neural networks | | 0 |
| SeaTurtleID2022: A long-span dataset for reliable sea turtle re-identification | | 0 |
| Prompt Sketching for Large Language Models | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |