SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 2351–2360 of 5548 papers

Title	Date	Tasks	Status	Hype
Keras Sig: Efficient Path Signature Computation on GPU in Keras 3	Jan 14, 2025	BenchmarkingC++ code	—Unverified	0
Investigating Energy Efficiency and Performance Trade-offs in LLM Inference Across Tasks and DVFS Settings	Jan 14, 2025	BenchmarkingQuestion Answering	—Unverified	0
Benchmarking Graph Representations and Graph Neural Networks for Multivariate Time Series Classification	Jan 14, 2025	BenchmarkingGraph Representation Learning	CodeCode Available	0
Lessons From Red Teaming 100 Generative AI Products	Jan 13, 2025	BenchmarkingRed Teaming	—Unverified	0
Stronger Than You Think: Benchmarking Weak Supervision on Realistic Tasks	Jan 13, 2025	Benchmarking	CodeCode Available	0
Benchmarking Abstractive Summarisation: A Dataset of Human-authored Summaries of Norwegian News Articles	Jan 13, 2025	ArticlesBenchmarking	—Unverified	0
The Paradox of Success in Evolutionary and Bioinspired Optimization: Revisiting Critical Issues, Key Studies, and Methodological Pathways	Jan 13, 2025	BenchmarkingMetaheuristic Optimization	—Unverified	0
Understanding and Benchmarking Artificial Intelligence: OpenAI's o3 Is Not AGI	Jan 13, 2025	ARCBenchmarking	—Unverified	0
Benchmarking YOLOv8 for Optimal Crack Detection in Civil Infrastructure	Jan 12, 2025	BenchmarkingHyperparameter Optimization	—Unverified	0
Benchmarking Rotary Position Embeddings for Automatic Speech Recognition	Jan 10, 2025	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified	0

Show:10 25 50

← PrevPage 236 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified