SOTAVerified

Benchmarking

Papers

Showing 181–190 of 5548 papers

Title	Date	Tasks	Status	Hype
Assessing SPARQL capabilities of Large Language Models	Sep 9, 2024	Benchmarking, Knowledge Graphs	Code Available	2
A Survey on Multimodal Benchmarks: In the Era of Large AI Models	Sep 21, 2024	Benchmarking, Survey	Code Available	2
FaceScore: Benchmarking and Enhancing Face Quality in Human Generation	Jun 24, 2024	Benchmarking, Denoising	Code Available	2
GDGB: A Benchmark for Generative Dynamic Text-Attributed Graph Learning	Jul 4, 2025	Benchmarking, Graph Generation	Code Available	2
GSCodec Studio: A Modular Framework for Gaussian Splat Compression	Jun 2, 2025	Benchmarking	Code Available	2
Interaction2Code: Benchmarking MLLM-based Interactive Webpage Code Generation from Interactive Prototyping	Nov 5, 2024	Benchmarking, Code Generation	Code Available	2
Event-Based Motion Magnification	Feb 19, 2024	Benchmarking, Motion Detection	Code Available	2
MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation	Aug 17, 2022	Benchmarking, Code Generation	Code Available	2
Aria Digital Twin: A New Benchmark Dataset for Egocentric 3D Machine Perception	Jun 10, 2023	3D Object Detection, Benchmarking	Code Available	2
Evaluating Large-Vocabulary Object Detectors: The Devil is in the Details	Feb 1, 2021	Benchmarking, object-detection	Code Available	2

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified