SOTAVerified

Image Classification

Image classification is a fundamental task in visual recognition that aims to understand an image as a whole and categorize it under a specific label. Unlike object detection, which both classifies and localizes multiple objects within an image, image classification typically deals with single-object images. When classification becomes highly fine-grained or reaches the instance level, it is often referred to as image retrieval, which also involves finding similar images in a large database.

Source: Metamorphic Testing for Object Detection Systems

Papers

Showing 4551-4600 of 10420 papers

Title | Status | Hype
Robust Cross-Modal Representation Learning with Progressive Self-Distillation | - | 0
Generative Adversarial Networks for Image Augmentation in Agriculture: A Systematic Review | Code | 1
Representation Learning by Detecting Incorrect Location Embeddings | Code | 0
Is my Driver Observation Model Overconfident? Input-guided Calibration Networks for Reliable and Interpretable Confidence Estimates | - | 0
Knowledge-Free Black-Box Watermark and Ownership Proof for Image Classification Neural Networks | - | 0
Neuronal diversity can improve machine learning for physics and beyond | Code | 0
A Survey of Supernet Optimization and its Applications: Spatial and Temporal Optimization for Neural Architecture Search | - | 0
Does Robustness on ImageNet Transfer to Downstream Tasks? | - | 0
Multimodal Quasi-AutoRegression: Forecasting the visual popularity of new fashion products | - | 0
Exploring Cross-Domain Pretrained Model for Hyperspectral Image Classification | - | 0
Multi-Sample ζ-mixup: Richer, More Realistic Synthetic Samples from a p-Series Interpolant | - | 0
DeepTensor: Low-Rank Tensor Decomposition with Deep Network Priors | - | 0
Unified Contrastive Learning in Image-Text-Label Space | Code | 2
DaViT: Dual Attention Vision Transformers | Code | 2
Total Variation Optimization Layers for Computer Vision | Code | 1
Solving ImageNet: a Unified Scheme for Training any Backbone to Top Results | Code | 2
MixFormer: Mixing Features across Windows and Dimensions | Code | 0
CAIPI in Practice: Towards Explainable Interactive Medical Image Classification | - | 0
Banana Sub-Family Classification and Quality Prediction using Computer Vision | - | 0
Universal Representations: A Unified Look at Multiple Task and Domain Learning | Code | 1
Contextual Attention Mechanism, SRGAN Based Inpainting System for Eliminating Interruptions from Images | - | 0
Fine-Grained Predicates Learning for Scene Graph Generation | Code | 1
Real-time Hyperspectral Imaging in Hardware via Trained Metasurface Encoders | Code | 1
Rethinking Visual Geo-localization for Large-Scale Applications | Code | 2
LatentGAN Autoencoder: Learning Disentangled Latent Distribution | - | 0
MetaAudio: A Few-Shot Audio Classification Benchmark | Code | 1
Interpretable Saliency Maps And Self-Supervised Learning For Generalized Zero Shot Medical Image Classification | - | 0
How stable are Transferability Metrics evaluations? | - | 0
MultiMAE: Multi-modal Multi-task Masked Autoencoders | Code | 2
BatchFormerV2: Exploring Sample Relationships for Dense Representation Learning | Code | 2
MaxViT: Multi-Axis Vision Transformer | Code | 3
Attribute Prototype Network for Any-Shot Learning | - | 0
Co-Teaching for Unsupervised Domain Adaptation and Expansion | Code | 0
Revisiting a kNN-based Image Classification System with High-capacity Storage | - | 0
Improving Vision Transformers by Revisiting High-frequency Components | Code | 1
Kernel Extreme Learning Machine Optimized by the Sparrow Search Algorithm for Hyperspectral Image Classification | - | 0
Chordal Sparsity for Lipschitz Constant Estimation of Deep Neural Networks | Code | 0
Mix-up Self-Supervised Learning for Contrast-agnostic Applications | - | 0
Efficient Convolutional Neural Networks on Raspberry Pi for Image Classification | Code | 1
Matching Feature Sets for Few-Shot Image Classification | - | 0
Proper Reuse of Image Classification Features Improves Object Detection | - | 0
Self-distillation Augmented Masked Autoencoders for Histopathological Image Classification | - | 0
Efficient Maximal Coding Rate Reduction by Variational Forms | - | 0
Multimodal Fusion Transformer for Remote Sensing Image Classification | Code | 1
Deep Hyperspectral Unmixing using Transformer Network | Code | 1
Weakly Supervised Patch Label Inference Networks for Efficient Pavement Distress Detection and Recognition in the Wild | Code | 0
Conditional Autoregressors are Interpretable Classifiers | - | 0
A fuzzy distance-based ensemble of deep models for cervical cancer detection | Code | 1
Fair Contrastive Learning for Facial Attribute Classification | Code | 1
Collaborative Transformers for Grounded Situation Recognition | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | CoCa (finetuned) | Top 1 Accuracy | 91 | - | Unverified
2 | Model soups (BASIC-L) | Top 1 Accuracy | 90.98 | - | Unverified
3 | Model soups (ViT-G/14) | Top 1 Accuracy | 90.94 | - | Unverified
4 | DaViT-G | Top 1 Accuracy | 90.4 | - | Unverified
5 | Meta Pseudo Labels (EfficientNet-L2) | Top 1 Accuracy | 90.2 | - | Unverified
6 | DaViT-H | Top 1 Accuracy | 90.2 | - | Unverified
7 | SwinV2-G | Top 1 Accuracy | 90.17 | - | Unverified
8 | MAWS (ViT-6.5B) | Top 1 Accuracy | 90.1 | - | Unverified
9 | Florence-CoSwin-H | Top 1 Accuracy | 90.05 | - | Unverified
10 | RevCol-H | Top 1 Accuracy | 90 | - | Unverified
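
The Top 1 Accuracy metric reported above is conventionally the percentage of test images whose single highest-scoring predicted class matches the ground-truth label. The following is a minimal sketch of that calculation in plain Python. The function name top1_accuracy and the toy scores and labels are hypothetical and chosen only for illustration; they do not come from any of the listed models or from a real benchmark.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of samples whose highest-scoring class equals the label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score is the predicted class for this image.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Hypothetical toy data: 4 images, 3 classes (not real benchmark outputs).
scores = [
    [0.1, 0.7, 0.2],  # predicted class 1
    [0.8, 0.1, 0.1],  # predicted class 0
    [0.3, 0.3, 0.4],  # predicted class 2
    [0.2, 0.5, 0.3],  # predicted class 1
]
labels = [1, 0, 2, 0]

accuracy = top1_accuracy(scores, labels)
print(accuracy)  # 75.0 (three of the four predictions match)
```

Under this definition, a claimed value such as 90.4 would mean that roughly 904 of every 1000 evaluation images receive the correct top-ranked label.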