Benchmarking

Papers

Showing 801–810 of 5548 papers

Title	Date	Tasks	Status	Hype
FORB: A Flat Object Retrieval Benchmark for Universal Image Embedding	Sep 28, 2023	Benchmarking, Image Retrieval	Code Available	1
ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis	Mar 9, 2021	Benchmarking, Classification	Code Available	1
ClimART: A Benchmark Dataset for Emulating Atmospheric Radiative Transfer in Weather and Climate Models	Nov 29, 2021	Benchmarking, Physical Simulations	Code Available	1
CodeIF: Benchmarking the Instruction-Following Capabilities of Large Language Models for Code Generation	Feb 26, 2025	Benchmarking, Code Generation	Code Available	1
Benchmarking AI scientists in omics data-driven biological research	May 13, 2025	Benchmarking, Multiple-choice	Code Available	1
Foundation Model of Electronic Medical Records for Adaptive Risk Estimation	Feb 10, 2025	Benchmarking	Code Available	1
A Dataset for Answering Time-Sensitive Questions	Aug 13, 2021	Benchmarking	Code Available	1
Benchmarking Algorithms for Federated Domain Generalization	Jul 11, 2023	Benchmarking, Diversity	Code Available	1
Benchmarking Algorithms for Submodular Optimization Problems Using IOHProfiler	Feb 2, 2023	Benchmarking, Evolutionary Algorithms	Code Available	1
ComplexBench-Edit: Benchmarking Complex Instruction-Driven Image Editing via Compositional Dependencies	Jun 15, 2025	Benchmarking	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified