SOTAVerified

Image Classification

Image Classification is a fundamental task in visual recognition that aims to understand and categorize an image as a whole under a specific label. Unlike object detection, which involves both classifying and localizing multiple objects within an image, image classification typically pertains to single-object images. When the classification becomes highly detailed or reaches the instance level, it is often referred to as image retrieval, which also involves finding similar images in a large database.

Source: Metamorphic Testing for Object Detection Systems

Papers

Showing 851-900 of 10419 papers

| Title | Status | Hype |
|---|---|---|
| Universal Domain Adaptation for Remote Sensing Image Scene Classification | Code | 1 |
| Discovering and Mitigating Visual Biases through Keyword Explanation | Code | 1 |
| ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients | Code | 1 |
| Trainable Activations for Image Classification | Code | 1 |
| Discriminator-free Unsupervised Domain Adaptation for Multi-label Image Classification | Code | 1 |
| Navigating the Pitfalls of Active Learning Evaluation: A Systematic Framework for Meaningful Performance Assessment | Code | 1 |
| Lightweight Neural Architecture Search for Temporal Convolutional Networks at the Edge | Code | 1 |
| Local Window Attention Transformer for Polarimetric SAR Image Classification | Code | 1 |
| Diagnose Like a Pathologist: Transformer-Enabled Hierarchical Attention-Guided Multiple Instance Learning for Whole Slide Image Classification | Code | 1 |
| Modeling Uncertain Feature Representation for Domain Generalization | Code | 1 |
| CLIP the Gap: A Single Domain Generalization Approach for Object Detection | Code | 1 |
| Dynamic Grained Encoder for Vision Transformers | Code | 1 |
| Learning Support and Trivial Prototypes for Interpretable Image Classification | Code | 1 |
| MoBYv2AL: Self-supervised Active Learning for Image Classification | Code | 1 |
| TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models | Code | 1 |
| Knockoffs-SPR: Clean Sample Selection in Learning with Noisy Labels | Code | 1 |
| Rate Gradient Approximation Attack Threats Deep Spiking Neural Networks | Code | 1 |
| Efficient On-device Training via Gradient Filtering | Code | 1 |
| Deep Factorized Metric Learning | Code | 1 |
| LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Sparse Retrieval | Code | 1 |
| PODA: Prompt-driven Zero-shot Domain Adaptation | Code | 1 |
| DISC: Learning From Noisy Labels via Dynamic Instance-Specific Selection and Correction | Code | 1 |
| Class-Aware Patch Embedding Adaptation for Few-Shot Image Classification | Code | 1 |
| PIP-Net: Patch-Based Intuitive Prototypes for Interpretable Image Classification | Code | 1 |
| Vision HGNN: An Image is More than a Graph of Nodes | Code | 1 |
| AdaptiveMix: Improving GAN Training via Feature Space Shrinkage | Code | 1 |
| Adaptive and Background-Aware Vision Transformer for Real-Time UAV Tracking | Code | 1 |
| ViewNet: A Novel Projection-Based Backbone With View Pooling for Few-Shot Point Cloud Classification | Code | 1 |
| A General Regret Bound of Preconditioned Gradient Method for DNN Training | Code | 1 |
| FCCNs: Fully Complex-valued Convolutional Networks using Complex-valued Color Model and Loss Function | Code | 1 |
| LaPE: Layer-adaptive Position Embedding for Vision Transformers with Independent Layer Normalization | Code | 1 |
| Neural Collapse in Deep Linear Networks: From Balanced to Imbalanced Data | Code | 1 |
| Learning Multimodal Data Augmentation in Feature Space | Code | 1 |
| Shape-Aware Fine-Grained Classification of Erythroid Cells | Code | 1 |
| Image Classification with Small Datasets: Overview and Benchmark | Code | 1 |
| On Calibrating Semantic Segmentation Models: Analyses and An Algorithm | Code | 1 |
| MaskingDepth: Masked Consistency Regularization for Semi-supervised Monocular Depth Estimation | Code | 1 |
| Style-Hallucinated Dual Consistency Learning: A Unified Framework for Visual Domain Generalization | Code | 1 |
| Convolution-enhanced Evolving Attention Networks | Code | 1 |
| Post-hoc Uncertainty Learning using a Dirichlet Meta-Model | Code | 1 |
| Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language | Code | 1 |
| Reproducible scaling laws for contrastive language-image learning | Code | 1 |
| Domain Generalization by Learning and Removing Domain-specific Features | Code | 1 |
| Regularized Optimal Transport Layers for Generalized Global Pooling Operations | Code | 1 |
| GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation | Code | 1 |
| DISCO: Adversarial Defense with Local Implicit Functions | Code | 1 |
| A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others | Code | 1 |
| Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning | Code | 1 |
| PØDA: Prompt-driven Zero-shot Domain Adaptation | Code | 1 |
| Causal Inference via Style Transfer for Out-of-distribution Generalisation | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CoCa (finetuned) | Top 1 Accuracy | 91 | | Unverified |
| 2 | Model soups (BASIC-L) | Top 1 Accuracy | 90.98 | | Unverified |
| 3 | Model soups (ViT-G/14) | Top 1 Accuracy | 90.94 | | Unverified |
| 4 | DaViT-G | Top 1 Accuracy | 90.4 | | Unverified |
| 5 | Meta Pseudo Labels (EfficientNet-L2) | Top 1 Accuracy | 90.2 | | Unverified |
| 6 | DaViT-H | Top 1 Accuracy | 90.2 | | Unverified |
| 7 | SwinV2-G | Top 1 Accuracy | 90.17 | | Unverified |
| 8 | MAWS (ViT-6.5B) | Top 1 Accuracy | 90.1 | | Unverified |
| 9 | Florence-CoSwin-H | Top 1 Accuracy | 90.05 | | Unverified |
| 10 | RevCol-H | Top 1 Accuracy | 90 | | Unverified |
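
The page does not say how the Top 1 Accuracy metric is computed or which dataset it refers to. As a rough illustration, the sketch below uses the commonly used definition: the percentage of samples whose highest-scoring class matches the true label. The function name `top1_accuracy` and the toy scores and labels are hypothetical and are not taken from any listed model or benchmark. It also assumes the claimed values above are percentages.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of samples whose arg-max class equals the label.

    Hypothetical helper for illustration; ties go to the lowest class index.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score in this row is the predicted class.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy data (made up): 4 samples, 3 classes.
scores = [
    [0.1, 0.7, 0.2],  # predicted class 1
    [0.8, 0.1, 0.1],  # predicted class 0
    [0.3, 0.3, 0.4],  # predicted class 2
    [0.2, 0.5, 0.3],  # predicted class 1
]
labels = [1, 0, 1, 1]

accuracy = top1_accuracy(scores, labels)
print(f"Top 1 Accuracy: {accuracy:.1f}")  # 3 of 4 correct -> 75.0
```

Under this definition, a claimed value of 90.2 would mean that roughly 902 of every 1000 evaluation images receive the correct label as the single top prediction.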