SOTAVerified

Benchmarking

Papers

Showing 801-810 of 5548 papers

Title | Status | Hype
FORB: A Flat Object Retrieval Benchmark for Universal Image Embedding | Code | 1
ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis | Code | 1
ClimART: A Benchmark Dataset for Emulating Atmospheric Radiative Transfer in Weather and Climate Models | Code | 1
CodeIF: Benchmarking the Instruction-Following Capabilities of Large Language Models for Code Generation | Code | 1
Benchmarking AI scientists in omics data-driven biological research | Code | 1
Foundation Model of Electronic Medical Records for Adaptive Risk Estimation | Code | 1
A Dataset for Answering Time-Sensitive Questions | Code | 1
Benchmarking Algorithms for Federated Domain Generalization | Code | 1
Benchmarking Algorithms for Submodular Optimization Problems Using IOHProfiler | Code | 1
ComplexBench-Edit: Benchmarking Complex Instruction-Driven Image Editing via Compositional Dependencies | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified