SOTAVerified

Benchmarking

Papers

Showing 2701–2710 of 5548 papers

Title | Status | Hype
AI Matrix - Synthetic Benchmarks for DNN | | 0
Galvatron: An Automatic Distributed System for Efficient Foundation Model Training | | 0
Factuality or Fiction? Benchmarking Modern LLMs on Ambiguous QA with Citations | | 0
Benchmarking the Performance of Pre-trained LLMs across Urdu NLP Tasks | | 0
GANmut: Generating and Modifying Facial Expressions | | 0
GaSLight: Gaussian Splats for Spatially-Varying Lighting in HDR | | 0
FactLens: Benchmarking Fine-Grained Fact Verification | | 0
GateLens: A Reasoning-Enhanced LLM Agent for Automotive Software Release Analytics | | 0
FACT: Learning Governing Abstractions Behind Integer Sequences | | 0
Benchmarking Pretrained Attention-based Models for Real-Time Recognition in Robot-Assisted Esophagectomy | | 0
Page 271 of 555

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified