SOTAVerified

Benchmarking

Papers

Showing 271–280 of 5548 papers

Title | Status | Hype
Benchmarking Representations for Speech, Music, and Acoustic Events | Code | 2
HLSFactory: A Framework Empowering High-Level Synthesis Datasets for Machine Learning and Beyond | Code | 2
SIDBench: A Python Framework for Reliably Assessing Synthetic Image Detection Methods | Code | 2
Benchmarking Benchmark Leakage in Large Language Models | Code | 2
LongEmbed: Extending Embedding Models for Long Context Retrieval | Code | 2
VBR: A Vision Benchmark in Rome | Code | 2
Revealing data leakage in protein interaction benchmarks | Code | 2
No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance | Code | 2
EV2Gym: A Flexible V2G Simulator for EV Smart Charging Research and Benchmarking | Code | 2
Are large language models superhuman chemists? | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified