Benchmarking

Papers

Showing 2931–2940 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
The Forchheim Image Database for Camera Identification in the Wild	Nov 4, 2020	Benchmarking, Fact Checking	Unverified	0	0
MultiTrust: A Comprehensive Benchmark Towards Trustworthy Multimodal Large Language Models	Jun 11, 2024	Benchmarking, Fairness	Unverified	0	0
How Universal are Universal Dependencies? Exploiting Syntax for Multilingual Clause-level Sentiment Detection	May 1, 2020	Benchmarking, BIG-bench Machine Learning	Unverified	0	0
Benchmarking Transformers-based models on French Spoken Language Understanding tasks	Jul 19, 2022	Benchmarking, Spoken Language Understanding	Unverified	0	0
How well it works: Benchmarking performance of GPT models on medical natural language processing tasks	Jun 12, 2024	Benchmarking	Unverified	0	0
You Only Crash Once v2: Perceptually Consistent Strong Features for One-Stage Domain Adaptive Detection of Space Terrain	Jan 23, 2025	Benchmarking, Domain Adaptation	Unverified	0	0
The Impact of ASR on the Automatic Analysis of Linguistic Complexity and Sophistication in Spontaneous L2 Speech	Apr 17, 2021	Benchmarking	Unverified	0	0
The Impact of Genomic Variation on Function (IGVF) Consortium	Jul 24, 2023	Benchmarking	Unverified	0	0
A General Taylor Framework for Unifying and Revisiting Attribution Methods	May 28, 2021	Benchmarking, Decision Making	Unverified	0	0
HULK: An Energy Efficiency Benchmark Platform for Responsible Natural Language Processing	Feb 14, 2020	Benchmarking	Unverified	0	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified