SOTAVerified

Benchmarking

Papers

Showing 4121–4130 of 5548 papers

Title | Status | Hype
The FACTS Grounding Leaderboard: Benchmarking LLMs' Ability to Ground Responses to Long-Form Input | | 0
The Forchheim Image Database for Camera Identification in the Wild | | 0
The Impact of ASR on the Automatic Analysis of Linguistic Complexity and Sophistication in Spontaneous L2 Speech | | 0
The Impact of Genomic Variation on Function (IGVF) Consortium | | 0
The iNaturalist Sounds Dataset | | 0
The Interactive Effects of Operators and Parameters to GA Performance Under Different Problem Sizes | | 0
The JPEG Pleno Learning-based Point Cloud Coding Standard: Serving Man and Machine | | 0
The Jungle of Generative Drug Discovery: Traps, Treasures, and Ways Out | | 0
The Karp Dataset | | 0
The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified