SOTAVerified

Image Classification

Image classification is a fundamental task in visual recognition that aims to understand an image as a whole and assign it a single label. Unlike object detection, which classifies and localizes multiple objects within an image, image classification typically deals with single-object images. When the classification becomes very fine-grained or reaches the instance level, it is often referred to as image retrieval, which also involves finding similar images in a large database.

Source: Metamorphic Testing for Object Detection Systems

Papers

Showing 1601-1650 of 10419 papers

| Title | Status | Hype |
| --- | --- | --- |
| Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks | Code | 1 |
| DLME: Deep Local-flatness Manifold Embedding | Code | 1 |
| Diverse Sample Generation: Pushing the Limit of Generative Data-free Quantization | Code | 1 |
| DKDFN: Domain Knowledge-Guided deep collaborative fusion network for multimodal unitemporal remote sensing land cover classification | Code | 1 |
| Diversified in-domain synthesis with efficient fine-tuning for few-shot classification | Code | 1 |
| Diversify and Disambiguate: Learning From Underspecified Data | Code | 1 |
| DMT-JEPA: Discriminative Masked Targets for Joint-Embedding Predictive Architecture | Code | 1 |
| Do Deep Networks Transfer Invariances Across Classes? | Code | 1 |
| DLTTA: Dynamic Learning Rate for Test-time Adaptation on Cross-domain Medical Images | Code | 1 |
| Neural Architecture Search for Lightweight Non-Local Networks | Code | 1 |
| Fixed-point Quantization of Convolutional Neural Networks for Quantized Inference on Embedded Platforms | Code | 1 |
| DocXClassifier: High Performance Explainable Deep Network for Document Image Classification | Code | 1 |
| NeuralEF: Deconstructing Kernels by Deep Neural Networks | Code | 1 |
| Neural Ensemble Search for Uncertainty Estimation and Dataset Shift | Code | 1 |
| Fine-Grained Predicates Learning for Scene Graph Generation | Code | 1 |
| A New Semi-supervised Learning Benchmark for Classifying View and Diagnosing Aortic Stenosis from Echocardiograms | Code | 1 |
| Mixture Outlier Exposure: Towards Out-of-Distribution Detection in Fine-grained Environments | Code | 1 |
| Fcaformer: Forward Cross Attention in Hybrid Vision Transformer | Code | 1 |
| Fine-grained Recognition with Learnable Semantic Data Augmentation | Code | 1 |
| Domain Adaptation for Multi-label Image Classification: a Discriminator-free Approach | Code | 1 |
| Does VLM Classification Benefit from LLM Description Semantics? | Code | 1 |
| A Less Biased Evaluation of Out-of-distribution Sample Detectors | Code | 1 |
| Domain-Adversarial Training of Neural Networks | Code | 1 |
| Non-convex Learning via Replica Exchange Stochastic Gradient MCMC | Code | 1 |
| Non-Local Neural Networks With Grouped Bilinear Attentional Transforms | Code | 1 |
| No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models | Code | 1 |
| Can An Image Classifier Suffice For Action Recognition? | Code | 1 |
| Domain Generalization by Learning and Removing Domain-specific Features | Code | 1 |
| NormKD: Normalized Logits for Knowledge Distillation | Code | 1 |
| No Subclass Left Behind: Fine-Grained Robustness in Coarse-Grained Classification Problems | Code | 1 |
| Fine-grained Classes and How to Find Them | Code | 1 |
| Maximum Likelihood with Bias-Corrected Calibration is Hard-To-Beat at Label Shift Adaptation | Code | 1 |
| Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training | Code | 1 |
| CamDiff: Camouflage Image Augmentation via Diffusion Model | Code | 1 |
| Circumventing Outliers of AutoAugment with Knowledge Distillation | Code | 1 |
| CAMIL: Context-Aware Multiple Instance Learning for Cancer Detection and Subtyping in Whole Slide Images | Code | 1 |
| 3D U^2-Net: A 3D Universal U-Net for Multi-Domain Medical Image Segmentation | Code | 1 |
| Fine-grained Image Classification and Retrieval by Combining Visual and Locally Pooled Textual Features | Code | 1 |
| Fine-Grained Self-Supervised Learning with Jigsaw Puzzles for Medical Image Classification | Code | 1 |
| CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets | Code | 1 |
| Can Biases in ImageNet Models Explain Generalization? | Code | 1 |
| Do text-free diffusion models learn discriminative visual representations? | Code | 1 |
| FILIP: Fine-grained Interactive Language-Image Pre-Training | Code | 1 |
| Astroformer: More Data Might not be all you need for Classification | Code | 1 |
| An In-depth Study of Stochastic Backpropagation | Code | 1 |
| Can Language Understand Depth? | Code | 1 |
| A Dual-Direction Attention Mixed Feature Network for Facial Expression Recognition | Code | 1 |
| Inducing Neural Collapse in Imbalanced Learning: Do We Really Need a Learnable Classifier at the End of Deep Neural Network? | Code | 1 |
| Conditional Positional Encodings for Vision Transformers | Code | 1 |
| Active Finetuning: Exploiting Annotation Budget in the Pretraining-Finetuning Paradigm | Code | 1 |
Page 33 of 209

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CoCa (finetuned) | Top 1 Accuracy | 91 | | Unverified |
| 2 | Model soups (BASIC-L) | Top 1 Accuracy | 90.98 | | Unverified |
| 3 | Model soups (ViT-G/14) | Top 1 Accuracy | 90.94 | | Unverified |
| 4 | DaViT-G | Top 1 Accuracy | 90.4 | | Unverified |
| 5 | DaViT-H | Top 1 Accuracy | 90.2 | | Unverified |
| 6 | Meta Pseudo Labels (EfficientNet-L2) | Top 1 Accuracy | 90.2 | | Unverified |
| 7 | SwinV2-G | Top 1 Accuracy | 90.17 | | Unverified |
| 8 | MAWS (ViT-6.5B) | Top 1 Accuracy | 90.1 | | Unverified |
| 9 | Florence-CoSwin-H | Top 1 Accuracy | 90.05 | | Unverified |
| 10 | Meta Pseudo Labels (EfficientNet-B6-Wide) | Top 1 Accuracy | 90 | | Unverified |
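
The Top 1 Accuracy metric in the benchmark results is conventionally the percentage of images for which the model's single highest-scoring class equals the ground-truth label. The following is a minimal sketch of that computation, not the evaluation code behind any listed result. The function name `top1_accuracy` and the example scores and labels are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of samples whose highest-scoring class equals the label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score; ties resolve to the lowest class index.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Hypothetical scores for 4 images over 3 classes (made-up numbers).
example_scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
example_labels = [1, 0, 2, 2]

print(top1_accuracy(example_scores, example_labels))  # 3 of 4 correct -> 75.0
```

The result is expressed on a 0 to 100 scale to match the Claimed column, where a value such as 90.98 means that 90.98% of evaluation images were labeled correctly.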