Benchmarking

Papers

Showing 791–800 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Benchmarking Large Multimodal Models against Common Corruptions	Jan 22, 2024	Benchmarking, Image to text	Code Available	1	5
Benchmarking Adversarial Patch Against Aerial Detection	Oct 30, 2022	Benchmarking	Code Available	1	5
dMelodies: A Music Dataset for Disentanglement Learning	Jul 29, 2020	Benchmarking, Disentanglement	Code Available	1	5
GeoBenchX: Benchmarking LLMs for Multistep Geospatial Tasks	Mar 23, 2025	Benchmarking, Hallucination	Code Available	1	5
Beyond Correctness: Benchmarking Multi-dimensional Code Generation for Large Language Models	Jul 16, 2024	Benchmarking, Code Generation	Code Available	1	5
Benchmarking Adversarial Robustness on Image Classification	Jun 1, 2020	Adversarial Attack, Adversarial Robustness	Code Available	1	5
Benchmarking of DL Libraries and Models on Mobile Devices	Feb 14, 2022	Benchmarking, GPU	Code Available	1	5
GLGENN: A Novel Parameter-Light Equivariant Neural Networks Architecture Based on Clifford Geometric Algebras	Jun 11, 2025	Benchmarking	Code Available	1	5
DNN+NeuroSim V2.0: An End-to-End Benchmarking Framework for Compute-in-Memory Accelerators for On-chip Training	Mar 13, 2020	Benchmarking, Quantization	Code Available	1	5
Does your model understand genes? A benchmark of gene properties for biological and text models	Dec 5, 2024	Benchmarking, Multi-class Classification	Code Available	1	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified