SOTAVerified

Benchmarking

Papers

Showing 181–190 of 5548 papers

Title	Date	Tasks	Status	Hype
Scalable Benchmarking and Robust Learning for Noise-Free Ego-Motion and 3D Reconstruction from Noisy Video	Jan 24, 2025	3D Reconstruction, Benchmarking	Code Available	2
OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?	Jan 9, 2025	Benchmarking, Video Understanding	Code Available	2
nnWNet: Rethinking the Use of Transformers in Biomedical Image Segmentation and Calling for a Unified Evaluation Benchmark	Jan 1, 2025	Benchmarking, Image Segmentation	Code Available	2
An OpenMind for 3D medical vision self-supervised learning	Dec 22, 2024	Benchmarking, Self-Supervised Learning	Code Available	2
XRAG: eXamining the Core -- Benchmarking Foundational Components in Advanced Retrieval-Augmented Generation	Dec 20, 2024	Benchmarking, Diagnostic	Code Available	2
AutoTrust: Benchmarking Trustworthiness in Large Vision Language Models for Autonomous Driving	Dec 19, 2024	Autonomous Driving, Benchmarking	Code Available	2
Open Universal Arabic ASR Leaderboard	Dec 18, 2024	Benchmarking	Code Available	2
NeuralPLexer3: Accurate Biomolecular Complex Structure Prediction with Flow Models	Dec 14, 2024	Benchmarking, Drug Design	Code Available	2
EvalGIM: A Library for Evaluating Generative Image Models	Dec 13, 2024	Benchmarking, Diversity	Code Available	2
Neptune: The Long Orbit to Benchmarking Long Video Understanding	Dec 12, 2024	Benchmarking, Multimodal Reasoning	Code Available	2

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified