Benchmarking

Papers

Showing 3621–3630 of 5548 papers

Title	Date	Tasks	Status	Hype
Benchmarking performance of object detection under image distortions in an uncontrolled environment	Oct 28, 2022	Benchmarking, Object	Code Available	0
Benchmarking Language Models for Code Syntax Understanding	Oct 26, 2022	Benchmarking	Code Available	1
What's Different between Visual Question Answering for Machine "Understanding" Versus for Accessibility?	Oct 26, 2022	Benchmarking, Question Answering	Code Available	0
pmuBAGE: The Benchmarking Assortment of Generated PMU Data for Power System Events	Oct 25, 2022	Benchmarking	Code Available	0
CrisisLTLSum: A Benchmark for Local Crisis Event Timeline Extraction and Summarization	Oct 25, 2022	Abstractive Text Summarization, Benchmarking	Code Available	0
A Comparative Attention Framework for Better Few-Shot Object Detection on Aerial Images	Oct 25, 2022	Benchmarking, Few-Shot Object Detection	Code Available	1
Deep Crowd Anomaly Detection: State-of-the-Art, Challenges, and Future Research Directions	Oct 25, 2022	Anomaly Detection, Benchmarking	Unverified	0
What cleaves? Is proteasomal cleavage prediction reaching a ceiling?	Oct 24, 2022	Benchmarking, Denoising	Unverified	0
ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition	Oct 24, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
SpikeSim: An end-to-end Compute-in-Memory Hardware Evaluation Tool for Benchmarking Spiking Neural Networks	Oct 24, 2022	Benchmarking	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified