SOTAVerified

Benchmarking

Papers

Showing 181-190 of 5548 papers

Title | Status | Hype
Scalable Benchmarking and Robust Learning for Noise-Free Ego-Motion and 3D Reconstruction from Noisy Video | Code | 2
OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding? | Code | 2
nnWNet: Rethinking the Use of Transformers in Biomedical Image Segmentation and Calling for a Unified Evaluation Benchmark | Code | 2
An OpenMind for 3D medical vision self-supervised learning | Code | 2
XRAG: eXamining the Core -- Benchmarking Foundational Components in Advanced Retrieval-Augmented Generation | Code | 2
AutoTrust: Benchmarking Trustworthiness in Large Vision Language Models for Autonomous Driving | Code | 2
Open Universal Arabic ASR Leaderboard | Code | 2
NeuralPLexer3: Accurate Biomolecular Complex Structure Prediction with Flow Models | Code | 2
EvalGIM: A Library for Evaluating Generative Image Models | Code | 2
Neptune: The Long Orbit to Benchmarking Long Video Understanding | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified