SOTAVerified

Benchmarking

Papers

Showing 3841–3850 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
NEWTS: A Corpus for News Topic-Focused Summarization	May 31, 2022	Benchmarking, Text Summarization	Unverified	0	0
NEXT-EVAL: Next Evaluation of Traditional and LLM Web Data Record Extraction	May 21, 2025	Benchmarking, Hallucination	Unverified	0	0
Next-generation MRD assays: do we have the tools to evaluate them properly?	Oct 31, 2023	Benchmarking, Sensitivity	Unverified	0	0
Benchmarking confound regression strategies for the control of motion artifact in studies of functional connectivity	Aug 11, 2016	Benchmarking, Functional Connectivity	Unverified	0	0
NL2KQL: From Natural Language to Kusto Query	Apr 3, 2024	Benchmarking, Natural Language Queries	Unverified	0	0
Benchmarking and Building Zero-Shot Hindi Retrieval Model with Hindi-BEIR and NLLB-E5	Sep 9, 2024	Benchmarking, Information Retrieval	Unverified	0	0
Benchmarking common uncertainty estimation methods with histopathological images under domain shift and label noise	Jan 3, 2023	Benchmarking, Classification	Unverified	0	0
NLPre: a revised approach towards language-centric benchmarking of Natural Language Preprocessing systems	Mar 7, 2024	Benchmarking, Dependency Parsing	Unverified	0	0
A CUDA-Based Real Parameter Optimization Benchmark	Jul 29, 2014	Benchmarking, CPU	Unverified	0	0
Benchmarking Collaborative Learning Methods Cost-Effectiveness for Prostate Segmentation	Sep 29, 2023	Benchmarking, Federated Learning	Unverified	0	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified