SOTAVerified

Benchmarking

Papers

Showing 3951-3975 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks | Code | 1 |
| A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery | | 0 |
| A Comparison of Deep Learning MOS Predictors for Speech Synthesis Quality | | 0 |
| Efficient, Uncertainty-based Moderation of Neural Networks Text Classifiers | Code | 0 |
| Coarse-to-Fine Q-attention with Learned Path Ranking | Code | 1 |
| pmuBAGE: The Benchmarking Assortment of Generated PMU Data for Power System Events -- Part I: Overview and Results | Code | 0 |
| Intelligence at the Extreme Edge: A Survey on Reformable TinyML | | 0 |
| Multi-Class Road User Detection With 3+1D Radar in the View-of-Delft Dataset | Code | 2 |
| Unitail: Detecting, Reading, and Matching in Retail Scene | | 0 |
| Assessing the risk of re-identification arising from an attack on anonymised data | | 0 |
| Is Word Error Rate a good evaluation metric for Speech Recognition in Indic Languages? | | 0 |
| To Find Waldo You Need Contextual Cues: Debiasing Who's Waldo | Code | 0 |
| Earnings-22: A Practical Benchmark for Accents in the Wild | Code | 1 |
| Parameter-efficient Model Adaptation for Vision Transformers | Code | 1 |
| Treatment Learning Causal Transformer for Noisy Image Classification | | 0 |
| A Unified Study of Machine Learning Explanation Evaluation Metrics | | 0 |
| Benchmarking Deep AUROC Optimization: Loss Functions and Algorithmic Choices | | 0 |
| Benchmarking Algorithms for Automatic License Plate Recognition | | 0 |
| Fantastic Questions and Where to Find Them: FairytaleQA -- An Authentic Dataset for Narrative Comprehension | Code | 1 |
| Visual Abductive Reasoning | Code | 1 |
| LAMBDA: Covering the Solution Set of Black-Box Inequality by Search Space Quantization | | 0 |
| Benchmarking Visual Localization for Autonomous Navigation | Code | 1 |
| minicons: Enabling Flexible Behavioral and Representational Analyses of Transformer Language Models | Code | 1 |
| An Optical Control Environment for Benchmarking Reinforcement Learning Algorithms | Code | 0 |
| Comprehensive Benchmark Datasets for Amharic Scene Text Detection and Recognition | | 0 |
Page 159 of 222

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |