Benchmarking

Papers

Showing 1781–1790 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Adaptive Visual Scene Understanding: Incremental Scene Graph Generation	Oct 2, 2023	Benchmarking, Continual Learning	Code Available	0	5
Integrating Expert Knowledge into Logical Programs via LLMs	Feb 17, 2025	Benchmarking, Logical Reasoning	Code Available	0	5
Building a Large Scale Dataset for Image Emotion Recognition: The Fine Print and The Benchmark	May 9, 2016	Benchmarking, Emotion Recognition	Code Available	0	5
inMOTIFin: a lightweight end-to-end simulation software for regulatory sequences	Jun 25, 2025	Benchmarking	Code Available	0	5
InstaIndoor and Multi-modal Deep Learning for Indoor Scene Recognition	Dec 23, 2021	Benchmarking, Deep Learning	Code Available	0	5
Bugs in the Data: How ImageNet Misrepresents Biodiversity	Aug 24, 2022	Benchmarking, Object Detection	Code Available	0	5
CleanPatrick: A Benchmark for Image Data Cleaning	May 16, 2025	Benchmarking, Label Error Detection	Code Available	0	5
BubGAN: Bubble Generative Adversarial Networks for Synthesizing Realistic Bubbly Flow Images	Sep 7, 2018	Benchmarking	Code Available	0	5
bsnsing: A decision tree induction method based on recursive optimal boolean rule composition	May 30, 2022	Benchmarking	Code Available	0	5
BSBench: will your LLM find the largest prime number?	Jun 5, 2025	Benchmarking	Code Available	0	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified