SOTAVerified

Image Classification

Image classification is a fundamental task in visual recognition that aims to understand an image as a whole and assign it a specific label. Unlike object detection, which involves both classifying and localizing multiple objects within an image, image classification typically deals with single-object images. When the classification becomes very fine-grained or reaches the instance level, it is often referred to as image retrieval, which also involves finding similar images in a large database.

Source: Metamorphic Testing for Object Detection Systems

Papers

Showing 1201–1250 of 10419 papers

| Title | Status | Hype |
| --- | --- | --- |
| DualConv: Dual Convolutional Kernels for Lightweight Deep Neural Networks | Code | 1 |
| Faster hyperspectral image classification based on selective kernel mechanism using deep convolutional networks | Code | 1 |
| Source-Free Progressive Graph Learning for Open-Set Domain Adaptation | Code | 1 |
| DKDFN: Domain Knowledge-Guided deep collaborative fusion network for multimodal unitemporal remote sensing land cover classification | Code | 1 |
| Open-set Adversarial Defense with Clean-Adversarial Mutual Learning | Code | 1 |
| Indication as Prior Knowledge for Multimodal Disease Classification in Chest Radiographs with Transformers | Code | 1 |
| Entroformer: A Transformer-based Entropy Model for Learned Image Compression | Code | 1 |
| The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention | Code | 1 |
| CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning | Code | 1 |
| Image Difference Captioning with Pre-training and Contrastive Learning | Code | 1 |
| Class Distance Weighted Cross-Entropy Loss for Ulcerative Colitis Severity Estimation | Code | 1 |
| L2B: Learning to Bootstrap Robust Models for Combating Label Noise | Code | 1 |
| Uncertainty Modeling for Out-of-Distribution Generalization | Code | 1 |
| Transformers in Self-Supervised Monocular Depth Estimation with Unknown Camera Intrinsics | Code | 1 |
| Dataset Condensation with Contrastive Signals | Code | 1 |
| Diversify and Disambiguate: Learning From Underspecified Data | Code | 1 |
| data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language | Code | 1 |
| No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models | Code | 1 |
| Learning strides in convolutional neural networks | Code | 1 |
| Fast Convex Optimization for Two-Layer ReLU Networks: Equivalent Model Classes and Cone Decompositions | Code | 1 |
| When Do Flat Minima Optimizers Work? | Code | 1 |
| Fortuitous Forgetting in Connectionist Networks | Code | 1 |
| Plug-In Inversion: Model-Agnostic Inversion for Vision with Data Augmentations | Code | 1 |
| UQGAN: A Unified Model for Uncertainty Quantification of Deep Classifiers trained via Conditional GANs | Code | 1 |
| Rate Coding or Direct Coding: Which One is Better for Accurate, Robust, and Energy-efficient Spiking Neural Networks? | Code | 1 |
| DynaMixer: A Vision MLP Architecture with Dynamic Mixing | Code | 1 |
| Toward Training at ImageNet Scale with Differential Privacy | Code | 1 |
| An Analysis on Ensemble Learning optimized Medical Image Classification with Deep Convolutional Neural Networks | Code | 1 |
| Speeding up Heterogeneous Federated Learning with Sequentially Trained Superclients | Code | 1 |
| Sphere2Vec: Multi-Scale Representation Learning over a Spherical Surface for Geospatial Predictions | Code | 1 |
| Convolutional Xformers for Vision | Code | 1 |
| Revisiting Global Pooling through the Lens of Optimal Transport | Code | 1 |
| Adaptive DropBlock Enhanced Generative Adversarial Networks for Hyperspectral Image Classification | Code | 1 |
| Revisiting Weakly Supervised Pre-Training of Visual Perception Models | Code | 1 |
| PT4AL: Using Self-Supervised Pretext Tasks for Active Learning | Code | 1 |
| It's All in the Head: Representation Knowledge Distillation through Classifier Sharing | Code | 1 |
| The CLEAR Benchmark: Continual LEArning on Real-World Imagery | Code | 1 |
| Glance and Focus Networks for Dynamic Visual Recognition | Code | 1 |
| Robust and Resource-Efficient Data-Free Knowledge Distillation by Generative Pseudo Replay | Code | 1 |
| BottleFit: Learning Compressed Representations in Deep Neural Networks for Effective and Efficient Split Computing | Code | 1 |
| Lawin Transformer: Improving Semantic Segmentation Transformer with Multi-Scale Representations via Large Window Attention | Code | 1 |
| A Conservative Approach for Unbiased Learning on Unknown Biases | Code | 1 |
| Learnable Lookup Table for Neural Network Quantization | Code | 1 |
| Learn From Others and Be Yourself in Heterogeneous Federated Learning | Code | 1 |
| Optimal Representations for Covariate Shift | Code | 1 |
| Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks | Code | 1 |
| A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-language Model | Code | 1 |
| PRIME: A few primitives can boost robustness to common corruptions | Code | 1 |
| Vision Transformer for Small-Size Datasets | Code | 1 |
| Augmenting Convolutional networks with attention-based aggregation | Code | 1 |
Page 25 of 209

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CoCa (finetuned) | Top 1 Accuracy | 91 | | Unverified |
| 2 | Model soups (BASIC-L) | Top 1 Accuracy | 90.98 | | Unverified |
| 3 | Model soups (ViT-G/14) | Top 1 Accuracy | 90.94 | | Unverified |
| 4 | DaViT-G | Top 1 Accuracy | 90.4 | | Unverified |
| 5 | Meta Pseudo Labels (EfficientNet-L2) | Top 1 Accuracy | 90.2 | | Unverified |
| 6 | DaViT-H | Top 1 Accuracy | 90.2 | | Unverified |
| 7 | SwinV2-G | Top 1 Accuracy | 90.17 | | Unverified |
| 8 | MAWS (ViT-6.5B) | Top 1 Accuracy | 90.1 | | Unverified |
| 9 | Florence-CoSwin-H | Top 1 Accuracy | 90.05 | | Unverified |
| 10 | RevCol-H | Top 1 Accuracy | 90 | | Unverified |
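
For readers unfamiliar with the Top 1 Accuracy metric reported above, it is usually defined as the percentage of images whose highest-scoring predicted class equals the true label. The sketch below illustrates that definition only. The function name `top1_accuracy` and the toy scores and labels are hypothetical and invented for this example; they are not taken from any listed paper or from this site's verification process.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return top-1 accuracy as a percentage.

    scores[i][c] is the model's score for class c on example i (hypothetical input
    layout); labels[i] is the index of the true class for example i.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one example is required")
    correct = 0
    for row, label in zip(scores, labels):
        # The predicted class is the index of the highest score in the row.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy data, made up for illustration: 4 examples, 3 classes.
toy_scores = [
    [0.1, 0.7, 0.2],  # predicted class 1
    [0.8, 0.1, 0.1],  # predicted class 0
    [0.3, 0.3, 0.4],  # predicted class 2
    [0.2, 0.5, 0.3],  # predicted class 1 (true label is 2, so this one is wrong)
]
toy_labels = [1, 0, 2, 2]

accuracy = top1_accuracy(toy_scores, toy_labels)
print(f"Top-1 accuracy: {accuracy:.1f}%")  # prints "Top-1 accuracy: 75.0%"
```

Three of the four toy predictions match their labels, giving 75.0. The Claimed values in the table are on the same percentage scale, although the evaluation dataset and protocol behind each entry are not stated on this page.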