SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 3121–3130 of 5548 papers

Title	Date	Tasks	Status	Hype
FakeWatch ElectionShield: A Benchmarking Framework to Detect Fake News for Credible US Elections	Nov 27, 2023	ArticlesBenchmarking	—Unverified	0
FalseReject: A Resource for Improving Contextual Safety and Mitigating Over-Refusals in LLMs via Structured Reasoning	May 12, 2025	16kBenchmarking	—Unverified	0
Fantastic Questions and Where to Find Them: FairytaleQA--An Authentic Dataset for Narrative Comprehension	Nov 16, 2021	BenchmarkingQuestion Answering	—Unverified	0
Fantastic Questions and Where to Find Them: FairytaleQA – An Authentic Dataset for Narrative Comprehension	May 1, 2022	BenchmarkingQuestion Answering	—Unverified	0
FarsBase-KBP: A Knowledge Base Population System for the Persian Knowledge Graph	May 4, 2020	BenchmarkingEntity Linking	—Unverified	0
Fast, approximate kinetics of RNA folding	Jan 19, 2015	Benchmarking	—Unverified	0
FastDraft: How to Train Your Draft	Nov 17, 2024	BenchmarkingCode Completion	—Unverified	0
Fast Empirical Scenarios	Jul 8, 2023	BenchmarkingDecision Making	—Unverified	0
FastEnsemble: Benchmarking and Accelerating Ensemble-based Uncertainty Estimation for Image-to-Image Translation	Sep 29, 2021	BenchmarkingImage Generation	—Unverified	0
Fast Labeling and Transcription with the Speechalyzer Toolkit	May 1, 2012	Audio ClassificationBenchmarking	—Unverified	0

Show:10 25 50

← PrevPage 313 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified