Benchmarking

Papers

Showing 4641–4650 of 5548 papers

Title	Date	Tasks	Status	Hype
AMQA: An Adversarial Dataset for Benchmarking Bias of LLMs in Medicine and Healthcare	May 26, 2025	Benchmarking, Medical Diagnosis	Code Available	0
Towards Segment Anything Model (SAM) for Medical Image Segmentation: A Survey	May 5, 2023	Benchmarking, Image Generation	Code Available	0
How Far Are We from Optimal Reasoning Efficiency?	Jun 8, 2025	16k, Benchmarking	Code Available	0
Magnetic Resonance Imaging Feature-Based Subtyping and Model Ensemble for Enhanced Brain Tumor Segmentation	Dec 5, 2024	Benchmarking, Brain Tumor Segmentation	Code Available	0
Mahalanobis k-NN: A Statistical Lens for Robust Point-Cloud Registrations	Sep 10, 2024	Benchmarking, Point Cloud Registration	Code Available	0
Beyond Atomic Geometry Representations in Materials Science: A Human-in-the-Loop Multimodal Framework	May 30, 2025	Benchmarking	Code Available	0
Beyond Accuracy: A Consolidated Tool for Visual Question Answering Benchmarking	Oct 11, 2021	Benchmarking, Question Answering	Code Available	0
Malliavin-Mancino estimators implemented with non-uniform fast Fourier transforms	Mar 5, 2020	Benchmarking	Code Available	0
HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios	Jun 11, 2025	Action Recognition, Action Segmentation	Code Available	0
HOEG: A New Approach for Object-Centric Predictive Process Monitoring	Apr 8, 2024	Benchmarking, Graph Neural Network	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified