SOTAVerified

Benchmarking

Papers

Showing 831-840 of 5548 papers

Title | Status | Hype
A Survey on Graph Counterfactual Explanations: Definitions, Methods, Evaluation, and Research Challenges | Code | 1
Replication in Visual Diffusion Models: A Survey and Outlook | Code | 1
AIPerf: Automated machine learning as an AI-HPC benchmark | Code | 1
CASTLE: Benchmarking Dataset for Static Code Analyzers and LLMs towards CWE Detection | Code | 1
Benchmarking LLMs' Swarm intelligence | Code | 1
IMGTB: A Framework for Machine-Generated Text Detection Benchmarking | Code | 1
4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on Relational DBs | Code | 1
Can 3D Vision-Language Models Truly Understand Natural Language? | Code | 1
Benchmarking Local Robustness of High-Accuracy Binary Neural Networks for Enhanced Traffic Sign Recognition | Code | 1
EduBench: A Comprehensive Benchmarking Dataset for Evaluating Large Language Models in Diverse Educational Scenarios | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified