SOTAVerified

Benchmarking

Papers

Showing 711–720 of 5548 papers

Title | Status | Hype
CHILI: Chemically-Informed Large-scale Inorganic Nanomaterials Dataset for Advancing Graph Machine Learning | Code | 1
A Ladder of Causal Distances | Code | 1
Data Splits and Metrics for Method Benchmarking on Surgical Action Triplet Datasets | Code | 1
ATOMMIC: An Advanced Toolbox for Multitask Medical Imaging Consistency to facilitate Artificial Intelligence applications from acquisition to analysis in Magnetic Resonance Imaging | Code | 1
Atom-Level Optical Chemical Structure Recognition with Limited Supervision | Code | 1
dEchorate: a Calibrated Room Impulse Response Database for Echo-aware Signal Processing | Code | 1
On the Detectability of ChatGPT Content: Benchmarking, Methodology, and Evaluation through the Lens of Academic Writing | Code | 1
CIBench: Evaluating Your LLMs with a Code Interpreter Plugin | Code | 1
CloudEval-YAML: A Practical Benchmark for Cloud Configuration Generation | Code | 1
CCTV-Gun: Benchmarking Handgun Detection in CCTV Images | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified