Benchmarking

Papers

Showing 4841–4850 of 5548 papers

Title	Date	Tasks	Status	Hype
Automated deep learning segmentation of high-resolution 7 T postmortem MRI for quantitative analysis of structure-pathology correlations in neurodegenerative diseases	Mar 21, 2023	Anatomy, Benchmarking	Code Available	0
Unmasking Societal Biases in Respiratory Support for ICU Patients through Social Determinants of Health	Feb 23, 2025	Benchmarking, Fairness	Code Available	0
There's No Comparison: Reference-less Evaluation Metrics in Grammatical Error Correction	Oct 7, 2016	Benchmarking, Grammatical Error Correction	Code Available	0
SciEx: Benchmarking Large Language Models on Scientific Exams with Human Expert Grading and Automatic Grading	Jun 14, 2024	Benchmarking, Mathematical Proofs	Code Available	0
SciFaultyQA: Benchmarking LLMs on Faulty Science Question Detection with a GAN-Inspired Approach to Synthetic Dataset Generation	Dec 16, 2024	Benchmarking, Dataset Generation	Code Available	0
Benchmarking Safety Monitors for Image Classifiers with Machine Learning	Oct 4, 2021	Autonomous Vehicles, Benchmarking	Code Available	0
First-frame Supervised Video Polyp Segmentation via Propagative and Semantic Dual-teacher Network	Dec 21, 2024	Benchmarking, Transfer Learning	Code Available	0
Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models	Nov 29, 2018	Benchmarking, Diversity	Code Available	0
MOLE: Digging Tunnels Through Multimodal Multi-Objective Landscapes	Apr 22, 2022	Benchmarking	Code Available	0
A Linear Constrained Optimization Benchmark For Probabilistic Search Algorithms: The Rotated Klee-Minty Problem	Jul 26, 2018	Benchmarking, Evolutionary Algorithms	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified