Benchmarking

Papers

Showing 3071–3080 of 5548 papers

Title	Date	Tasks	Status	Hype
It's all about PR -- Smart Benchmarking AI Accelerators using Performance Representatives	Jun 12, 2024	All, Benchmarking	Unverified	0
Reinforcement Learning to Disentangle Multiqubit Quantum States from Partial Observations	Jun 12, 2024	Benchmarking, Deep Reinforcement Learning	Code Available	0
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets	Jun 12, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases	Jun 12, 2024	Benchmarking, Model Compression	Unverified	0
A PRISMA Driven Systematic Review of Publicly Available Datasets for Benchmark and Model Developments for Industrial Defect Detection	Jun 11, 2024	Benchmarking, Defect Detection	Unverified	0
Advancing Annotation of Stance in Social Media Posts: A Comparative Analysis of Large Language Models and Crowd Sourcing	Jun 11, 2024	Benchmarking, Stance Detection	Unverified	0
Benchmarking Vision-Language Contrastive Methods for Medical Representation Learning	Jun 11, 2024	Benchmarking, Contrastive Learning	Code Available	0
DB3V: A Dialect Dominated Dataset of Bird Vocalisation for Cross-corpus Bird Species Recognition	Jun 11, 2024	Benchmarking, Cross-corpus	Unverified	0
Benchmarking and Boosting Radiology Report Generation for 3D High-Resolution Medical Images	Jun 11, 2024	Benchmarking, GPU	Unverified	0
MultiTrust: A Comprehensive Benchmark Towards Trustworthy Multimodal Large Language Models	Jun 11, 2024	Benchmarking, Fairness	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified