SOTAVerified

Benchmarking

Papers

Showing 2521–2530 of 5548 papers

Title | Status | Hype
Abnormality-Driven Representation Learning for Radiology Imaging | | 0
Performance Benchmarking of Psychomotor Skills Using Wearable Devices: An Application in Sport | | 0
A Review of Bayesian Uncertainty Quantification in Deep Probabilistic Image Segmentation | | 0
Benchmarking Active Learning for NILM | | 0
ChemSafetyBench: Benchmarking LLM Safety on Chemistry Domain | Code | 0
Reassessing Layer Pruning in LLMs: New Insights and Methods | Code | 0
AdamZ: An Enhanced Optimisation Method for Neural Network Training | Code | 0
Benchmarking the Robustness of Optical Flow Estimation to Corruptions | Code | 0
Benchmarking Multimodal Models for Ukrainian Language Understanding Across Academic and Cultural Domains | | 0
Benchmarking GPT-4 against Human Translators: A Comprehensive Evaluation Across Languages, Domains, and Expertise Levels | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified