Benchmarking

Papers

Showing 2701–2710 of 5548 papers

Title	Date	Tasks	Status	Hype
AI Matrix - Synthetic Benchmarks for DNN	Nov 27, 2018	Benchmarking, CPU	Unverified	0
Galvatron: An Automatic Distributed System for Efficient Foundation Model Training	Apr 30, 2025	Benchmarking	Unverified	0
Factuality or Fiction? Benchmarking Modern LLMs on Ambiguous QA with Citations	Dec 23, 2024	Benchmarking, Question Answering	Unverified	0
Benchmarking the Performance of Pre-trained LLMs across Urdu NLP Tasks	May 24, 2024	Benchmarking, Decoder	Unverified	0
GANmut: Generating and Modifying Facial Expressions	Jun 16, 2024	Benchmarking, Diversity	Unverified	0
GaSLight: Gaussian Splats for Spatially-Varying Lighting in HDR	Apr 15, 2025	Benchmarking	Unverified	0
FactLens: Benchmarking Fine-Grained Fact Verification	Nov 8, 2024	Benchmarking, Fact Verification	Unverified	0
GateLens: A Reasoning-Enhanced LLM Agent for Automotive Software Release Analytics	Mar 27, 2025	Benchmarking, Natural Language Queries	Unverified	0
FACT: Learning Governing Abstractions Behind Integer Sequences	Sep 20, 2022	Benchmarking	Unverified	0
Benchmarking Pretrained Attention-based Models for Real-Time Recognition in Robot-Assisted Esophagectomy	Dec 4, 2024	Anatomy, Benchmarking	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified