SOTAVerified

Benchmarking

Papers

Showing 1151-1160 of 5548 papers

Title | Status | Hype
Geoclidean: Few-Shot Generalization in Euclidean Geometry | Code | 1
Graph Robustness Benchmark: Benchmarking the Adversarial Robustness of Graph Machine Learning | Code | 1
A Comparative Attention Framework for Better Few-Shot Object Detection on Aerial Images | Code | 1
Benchmarks for Deep Off-Policy Evaluation | Code | 1
HAWKS: Evolving Challenging Benchmark Sets for Cluster Analysis | Code | 1
Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data | Code | 1
GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning | Code | 1
A Closer Look at Mortality Risk Prediction from Electrocardiograms | Code | 1
Benchmarking MRI Reconstruction Neural Networks on Large Public Datasets | Code | 1
Benchmarking Large Language Models for News Summarization | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified