SOTAVerified

Benchmarking

Papers

Showing 4841–4850 of 5548 papers

Title | Status | Hype
Automated deep learning segmentation of high-resolution 7 T postmortem MRI for quantitative analysis of structure-pathology correlations in neurodegenerative diseases | Code | 0
Unmasking Societal Biases in Respiratory Support for ICU Patients through Social Determinants of Health | Code | 0
There's No Comparison: Reference-less Evaluation Metrics in Grammatical Error Correction | Code | 0
SciEx: Benchmarking Large Language Models on Scientific Exams with Human Expert Grading and Automatic Grading | Code | 0
SciFaultyQA: Benchmarking LLMs on Faulty Science Question Detection with a GAN-Inspired Approach to Synthetic Dataset Generation | Code | 0
Benchmarking Safety Monitors for Image Classifiers with Machine Learning | Code | 0
First-frame Supervised Video Polyp Segmentation via Propagative and Semantic Dual-teacher Network | Code | 0
Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models | Code | 0
MOLE: Digging Tunnels Through Multimodal Multi-Objective Landscapes | Code | 0
A Linear Constrained Optimization Benchmark For Probabilistic Search Algorithms: The Rotated Klee-Minty Problem | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified