SOTAVerified

Benchmarking

Papers

Showing 1241–1250 of 5548 papers

Title | Status | Hype
A Comprehensive Benchmark for COVID-19 Predictive Modeling Using Electronic Health Records in Intensive Care | Code | 1
HAWKS: Evolving Challenging Benchmark Sets for Cluster Analysis | Code | 1
Benchmarking Language Model Creativity: A Case Study on Code Generation | Code | 1
CLoG: Benchmarking Continual Learning of Image Generation Models | Code | 1
Clinical Prompt Learning with Frozen Language Models | Code | 1
Benchmarking structure-based three-dimensional molecular generative models using GenBench3D: ligand conformation quality matters | Code | 1
HazeSpace2M: A Dataset for Haze Aware Single Image Dehazing | Code | 1
HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning | Code | 1
Benchmarking Spectral Graph Neural Networks: A Comprehensive Study on Effectiveness and Efficiency | Code | 1
HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified