SOTAVerified

Image Classification

Image Classification is a fundamental task in visual recognition that aims to understand an image as a whole and assign it a specific label. Unlike object detection, which involves both classifying and localizing multiple objects within an image, image classification typically applies to single-object images. When the classification becomes highly fine-grained or reaches the instance level, it is often referred to as image retrieval, which also involves finding similar images in a large database.

Source: Metamorphic Testing for Object Detection Systems

Papers

Showing 2151–2200 of 10419 papers

Title | Status | Hype
Instance-Conditional Knowledge Distillation for Object Detection | Code | 1
Learning Partial Correlation based Deep Visual Representation for Image Classification | Code | 1
Data Feedback Loops: Model-driven Amplification of Dataset Biases | Code | 1
Learning Representational Invariances for Data-Efficient Action Recognition | Code | 1
Information Maximization Clustering via Multi-View Self-Labelling | Code | 1
Continual Hippocampus Segmentation with Transformers | Code | 1
Learning strides in convolutional neural networks | Code | 1
DataMUX: Data Multiplexing for Neural Networks | Code | 1
Continual Learning for LiDAR Semantic Segmentation: Class-Incremental and Coarse-to-Fine strategies on Sparse Data | Code | 1
Dataset Condensation with Contrastive Signals | Code | 1
Learning to Generalize: Meta-Learning for Domain Generalization | Code | 1
Learning to Learn Parameterized Classification Networks for Scalable Input Images | Code | 1
Information Bottleneck Approach to Spatial Attention Learning | Code | 1
InsPLAD: A Dataset and Benchmark for Power Line Asset Inspection in UAV Images | Code | 1
Instance-Dependent Noisy Label Learning via Graphical Modelling | Code | 1
Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model | Code | 1
Incremental Learning Techniques for Semantic Segmentation | Code | 1
Concept Learners for Few-Shot Learning | Code | 1
Continual Learning with Scaled Gradient Projection | Code | 1
DCT-CryptoNets: Scaling Private Inference in the Frequency Domain | Code | 1
DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention | Code | 1
DEAL: Deep Evidential Active Learning for Image Classification | Code | 1
DecAug: Out-of-Distribution Generalization via Decomposed Feature Representation and Semantic Augmentation | Code | 1
Deblurring Masked Autoencoder is Better Recipe for Ultrasound Image Recognition | Code | 1
Indication as Prior Knowledge for Multimodal Disease Classification in Chest Radiographs with Transformers | Code | 1
AutoAssist: A Framework to Accelerate Training of Deep Neural Networks | Code | 1
Incorporating Convolution Designs into Visual Transformers | Code | 1
AFN: Adaptive Fusion Normalization via an Encoder-Decoder Framework | Code | 1
Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Compositional Understanding | Code | 1
Less is More: Pay Less Attention in Vision Transformers | Code | 1
A Survey: Deep Learning for Hyperspectral Image Classification with Few Labeled Samples | Code | 1
Leveraging Cross-Modal Neighbor Representation for Improved CLIP Classification | Code | 1
Increasing Model Capacity for Free: A Simple Strategy for Parameter Efficient Fine-tuning | Code | 1
Decoupled Weight Decay Regularization | Code | 1
Contrastive Deep Supervision | Code | 1
LFI-CAM: Learning Feature Importance for Better Visual Explanation | Code | 1
InfoMatch: Entropy Neural Estimation for Semi-Supervised Image Classification | Code | 1
Deep AutoAugment | Code | 1
A Survey of Classical And Quantum Sequence Models | Code | 1
LightViT: Towards Light-Weight Convolution-Free Vision Transformers | Code | 1
No Routing Needed Between Capsules | Code | 1
Contrastive Learning Improves Model Robustness Under Label Noise | Code | 1
Instance Localization for Self-supervised Detection Pretraining | Code | 1
Contrastive Learning of Generalized Game Representations | Code | 1
InceptionMamba: An Efficient Hybrid Network with Large Band Convolution and Bottleneck Mamba | Code | 1
Contrastive Learning of Medical Visual Representations from Paired Images and Text | Code | 1
Deep Complex Networks | Code | 1
ConvMLP: Hierarchical Convolutional MLPs for Vision | Code | 1
Locally Shifted Attention With Early Global Integration | Code | 1
Deep Prototypical Networks with Hybrid Residual Attention for Hyperspectral Image Classification | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | CoCa (finetuned) | Top 1 Accuracy | 91 | | Unverified
2 | Model soups (BASIC-L) | Top 1 Accuracy | 90.98 | | Unverified
3 | Model soups (ViT-G/14) | Top 1 Accuracy | 90.94 | | Unverified
4 | DaViT-G | Top 1 Accuracy | 90.4 | | Unverified
5 | DaViT-H | Top 1 Accuracy | 90.2 | | Unverified
6 | Meta Pseudo Labels (EfficientNet-L2) | Top 1 Accuracy | 90.2 | | Unverified
7 | SwinV2-G | Top 1 Accuracy | 90.17 | | Unverified
8 | MAWS (ViT-6.5B) | Top 1 Accuracy | 90.1 | | Unverified
9 | Florence-CoSwin-H | Top 1 Accuracy | 90.05 | | Unverified
10 | Meta Pseudo Labels (EfficientNet-B6-Wide) | Top 1 Accuracy | 90 | | Unverified
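
The Top 1 Accuracy metric reported above is conventionally the percentage of test images for which a model's single highest-scoring class matches the ground-truth label. The sketch below illustrates that computation in plain Python. The function name `top1_accuracy` and the toy scores and labels are hypothetical, made up for illustration, and are not taken from any listed paper or from this site's evaluation code.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of samples whose highest-scoring class equals the label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score; on ties, the lowest index wins.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Hypothetical toy data: 4 images, 3 classes (rows are per-class scores).
scores = [
    [0.1, 0.7, 0.2],  # predicted class 1
    [0.8, 0.1, 0.1],  # predicted class 0
    [0.3, 0.3, 0.4],  # predicted class 2
    [0.2, 0.5, 0.3],  # predicted class 1
]
labels = [1, 0, 1, 1]

accuracy = top1_accuracy(scores, labels)
print(f"Top 1 Accuracy: {accuracy:.2f}")  # prints "Top 1 Accuracy: 75.00"
```

Three of the four toy predictions match their labels, so the result is 75.00. Real benchmark evaluations may differ in details such as tie handling, test-time cropping, or ensembling.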