SOTAVerified

Image Classification

Image classification is a fundamental task in visual recognition that aims to understand an image as a whole and assign it a single label. Unlike object detection, which both classifies and localizes multiple objects within an image, image classification typically deals with single-object images. When the classification becomes highly fine-grained or reaches instance level, it is often referred to as image retrieval, which also involves finding similar images in a large database.

Source: Metamorphic Testing for Object Detection Systems

Papers

Showing 1501-1550 of 10419 papers

Title | Status | Hype
Vision Transformers with Hierarchical Attention | Code | 1
Meta-Learning with Fewer Tasks through Task Interpolation | Code | 1
RegionViT: Regional-to-Local Attention for Vision Transformers | Code | 1
Predify: Augmenting deep neural networks with brain-inspired predictive coding dynamics | Code | 1
FedBABU: Towards Enhanced Representation for Federated Image Classification | Code | 1
Efficient Classification of Very Large Images with Tiny Objects | Code | 1
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification | Code | 1
Evidential Turing Processes | Code | 1
TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification | Code | 1
Towards Robust Classification Model by Counterfactual and Invariant Data Generation | Code | 1
Container: Context Aggregation Network | Code | 1
Hyperspectral Band Selection for Multispectral Image Classification with Convolutional Networks | Code | 1
MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens | Code | 1
Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition | Code | 1
Effect of Pre-Training Scale on Intra- and Inter-Domain Full and Few-Shot Transfer Learning for Natural and Medical X-Ray Chest Images | Code | 1
TransMatcher: Deep Image Matching Through Transformers for Generalizable Person Re-identification | Code | 1
EPSANet: An Efficient Pyramid Squeeze Attention Block on Convolutional Neural Network | Code | 1
Less is More: Pay Less Attention in Vision Transformers | Code | 1
A Spectral-Spatial-Dependent Global Learning Framework for Insufficient and Imbalanced Hyperspectral Image Classification | Code | 1
ResT: An Efficient Transformer for Visual Recognition | Code | 1
Hamiltonian Deep Neural Networks Guaranteeing Non-vanishing Gradients by Design | Code | 1
Predict then Interpolate: A Simple Algorithm to Learn Stable Classifiers | Code | 1
Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding | Code | 1
FedScale: Benchmarking Model and System Performance of Federated Learning at Scale | Code | 1
AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly | Code | 1
Extremely Lightweight Quantization Robust Real-Time Single-Image Super Resolution for Mobile Devices | Code | 1
Superpixel-based Knowledge Infusion in Deep Neural Networks for Image Classification | Code | 1
Comparing Kullback-Leibler Divergence and Mean Squared Error Loss in Knowledge Distillation | Code | 1
A Novel lightweight Convolutional Neural Network, ExquisiteNetV2 | Code | 1
Relative Positional Encoding for Transformers with Linear Complexity | Code | 1
Sparse Spiking Gradient Descent | Code | 1
Towards Robust Vision Transformer | Code | 1
Vision Transformers are Robust Learners | Code | 1
Pay Attention to MLPs | Code | 1
Semi-Supervised Classification and Segmentation on High Resolution Aerial Images | Code | 1
MutualNet: Adaptive ConvNet via Mutual Learning from Different Model Configurations | Code | 1
Segmenter: Transformer for Semantic Segmentation | Code | 1
Generalized Jensen-Shannon Divergence Loss for Learning with Noisy Labels | Code | 1
A Bregman Learning Framework for Sparse Neural Networks | Code | 1
Conformer: Local Features Coupling Global Representations for Visual Recognition | Code | 1
Truly shift-equivariant convolutional neural networks with adaptive polyphase upsampling | Code | 1
DiagSet: a dataset for prostate cancer histopathological image classification | Code | 1
Diffusion Mechanism in Residual Neural Network: Theory and Applications | Code | 1
ResMLP: Feedforward networks for image classification with data-efficient training | Code | 1
SparseConvMIL: Sparse Convolutional Context-Aware Multiple Instance Learning for Whole Slide Image Classification | Code | 1
Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on ImageNet | Code | 1
VideoLT: Large-scale Long-tailed Video Recognition | Code | 1
This Looks Like That... Does it? Shortcomings of Latent Space Prototype Interpretability in Deep Networks | Code | 1
Soft-Attention Improves Skin Cancer Classification Performance | Code | 1
RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | CoCa (finetuned) | Top 1 Accuracy | 91 | | Unverified
2 | Model soups (BASIC-L) | Top 1 Accuracy | 90.98 | | Unverified
3 | Model soups (ViT-G/14) | Top 1 Accuracy | 90.94 | | Unverified
4 | DaViT-G | Top 1 Accuracy | 90.4 | | Unverified
5 | Meta Pseudo Labels (EfficientNet-L2) | Top 1 Accuracy | 90.2 | | Unverified
6 | DaViT-H | Top 1 Accuracy | 90.2 | | Unverified
7 | SwinV2-G | Top 1 Accuracy | 90.17 | | Unverified
8 | MAWS (ViT-6.5B) | Top 1 Accuracy | 90.1 | | Unverified
9 | Florence-CoSwin-H | Top 1 Accuracy | 90.05 | | Unverified
10 | RevCol-H | Top 1 Accuracy | 90 | | Unverified
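
The benchmark above reports Top 1 Accuracy. As a rough illustration of how such a figure is usually computed, the sketch below counts a prediction as correct when the highest-scoring class matches the true label and reports the result as a percentage. The function name `top1_accuracy` and the toy scores and labels are hypothetical and are not taken from any of the listed models or from the benchmark's evaluation code.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of samples whose top-scoring class equals the label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score is the predicted class (first index wins ties).
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Hypothetical toy data: 4 images, 3 classes (illustrative only).
scores = [
    [0.1, 0.7, 0.2],  # predicted class 1
    [0.8, 0.1, 0.1],  # predicted class 0
    [0.3, 0.3, 0.4],  # predicted class 2
    [0.5, 0.4, 0.1],  # predicted class 0
]
labels = [1, 0, 1, 0]

accuracy = top1_accuracy(scores, labels)
print(f"Top-1 accuracy: {accuracy:.1f}%")  # Top-1 accuracy: 75.0%
```

Three of the four toy predictions match their labels, which gives 75.0. The "Claimed" column presumably holds values on the same 0 to 100 scale, although the listing itself does not say which dataset or evaluation protocol was used.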