SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 3201–3210 of 5548 papers

Title	Date	Tasks	Status	Hype
From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future	Aug 5, 2024	BenchmarkingCode Generation	—Unverified	0
From Precision to Perception: User-Centred Evaluation of Keyword Extraction Algorithms for Internet-Scale Contextual Advertising	Apr 30, 2025	BenchmarkingComputational Efficiency	—Unverified	0
From Private to Public: Benchmarking GANs in the Context of Private Time Series Classification	Mar 28, 2023	BenchmarkingPrivacy Preserving	—Unverified	0
From Protoscience to Epistemic Monoculture: How Benchmarking Set the Stage for the Deep Learning Revolution	Apr 9, 2024	Benchmarking	—Unverified	0
From Sound Representation to Model Robustness	Jul 27, 2020	Adversarial AttackAdversarial Robustness	—Unverified	0
From Standalone LLMs to Integrated Intelligence: A Survey of Compound Al Systems	Jun 5, 2025	BenchmarkingRAG	—Unverified	0
From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference	Oct 4, 2023	BenchmarkingGPU	—Unverified	0
FSD-10: A Dataset for Competitive Sports Content Analysis	Feb 9, 2020	Action RecognitionBenchmarking	—Unverified	0
Full-scale modal testing of a Hawk T1A aircraft for benchmarking vibration-based methods	Oct 6, 2023	BenchmarkingExperimental Design	—Unverified	0
Full-stack evaluation of Machine Learning inference workloads for RISC-V systems	May 24, 2024	BenchmarkingDeep Learning	—Unverified	0

Show:10 25 50

← PrevPage 321 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified