SOTAVerified

Image Classification

Image Classification is a fundamental task in visual recognition that aims to understand and categorize an image as a whole under a specific label. Unlike object detection, which involves both classifying and localizing multiple objects within an image, image classification typically pertains to single-object images. When the classification becomes highly detailed or reaches the instance level, it is often referred to as image retrieval, which also involves finding similar images in a large database.
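
As a rough illustration of "one label for the whole image", the sketch below turns a vector of per-class scores into a single predicted label. The function name `classify`, the example scores, and the label names are hypothetical and stand in for whatever a real model would output; softmax followed by argmax is a common choice, not something the description above prescribes.

```python
import math

def classify(logits, labels):
    """Return (label, probability) for the highest-scoring class of one image."""
    # Softmax with max-subtraction for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Image classification keeps only the single best class for the whole image.
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Hypothetical scores a model might emit for one image (assumed values).
label, prob = classify([2.0, 0.5, -1.0], ["cat", "dog", "car"])
print(label, round(prob, 3))
```

An object detector would instead return several (box, label) pairs per image, which is the distinction the paragraph above draws.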

Source: Metamorphic Testing for Object Detection Systems

Papers

Showing 751-800 of 10419 papers

| Title | Status | Hype |
|---|---|---|
| A Closer Look at Self-Supervised Lightweight Vision Transformers | Code | 1 |
| Dynamic MLP for Fine-Grained Image Classification by Leveraging Geographical and Temporal Information | Code | 1 |
| Addressing Failure Detection by Learning Model Confidence | Code | 1 |
| Blacklight: Scalable Defense for Neural Networks against Query-Based Black-Box Attacks | Code | 1 |
| Deep Roto-Translation Scattering for Object Classification | Code | 1 |
| DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification | Code | 1 |
| Understanding the Role of the Projector in Knowledge Distillation | Code | 1 |
| Early-Learning Regularization Prevents Memorization of Noisy Labels | Code | 1 |
| Deep Subdomain Adaptation Network for Image Classification | Code | 1 |
| Grafting Transformer on Automatically Designed Convolutional Neural Network for Hyperspectral Image Classification | Code | 1 |
| Boosting Convolutional Neural Networks with Middle Spectrum Grouped Convolution | Code | 1 |
| Boosting Active Learning via Improving Test Performance | Code | 1 |
| AlphaNet: Improved Training of Supernets with Alpha-Divergence | Code | 1 |
| AutoDiCE: Fully Automated Distributed CNN Inference at the Edge | Code | 1 |
| An Empirical Investigation of Representation Learning for Imitation | Code | 1 |
| An Empirical Investigation of the Role of Pre-training in Lifelong Learning | Code | 1 |
| Bayesian Neural Network Priors Revisited | Code | 1 |
| AutoDC: Automated data-centric processing | Code | 1 |
| A deep active learning system for species identification and counting in camera trap images | Code | 1 |
| Boosting vision transformers for image retrieval | Code | 1 |
| AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly | Code | 1 |
| A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others | Code | 1 |
| Bootstrap your own latent: A new approach to self-supervised Learning | Code | 1 |
| BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search | Code | 1 |
| Deep Transfer Learning for Land Use and Land Cover Classification: A Comparative Study | Code | 1 |
| DeiT-LT Distillation Strikes Back for Vision Transformer Training on Long-Tailed Datasets | Code | 1 |
| Layer-adaptive sparsity for the Magnitude-based Pruning | Code | 1 |
| An Enhanced Scheme for Reducing the Complexity of Pointwise Convolutions in CNNs for Image Classification Based on Interleaved Grouped Filters without Divisibility Constraints | Code | 1 |
| DEUP: Direct Epistemic Uncertainty Prediction | Code | 1 |
| Breast Cancer Histopathology Image Classification and Localization using Multiple Instance Learning | Code | 1 |
| An Ensemble of Simple Convolutional Neural Network Models for MNIST Digit Recognition | Code | 1 |
| Leveraging Vision-Language Models for Improving Domain Generalization in Image Classification | Code | 1 |
| Efficient Classification of Very Large Images with Tiny Objects | Code | 1 |
| A Unified Algebraic Perspective on Lipschitz Neural Networks | Code | 1 |
| A Neural Dirichlet Process Mixture Model for Task-Free Continual Learning | Code | 1 |
| Bridging the Gap: Multi-Level Cross-Modality Joint Alignment for Visible-Infrared Person Re-Identification | Code | 1 |
| AugNet: End-to-End Unsupervised Visual Representation Learning with Image Augmentation | Code | 1 |
| Deep Network Ensemble Learning applied to Image Classification using CNN Trees | Code | 1 |
| BSRBF-KAN: A combination of B-splines and Radial Basis Functions in Kolmogorov-Arnold Networks | Code | 1 |
| BSNet: Bi-Similarity Network for Few-shot Fine-grained Image Classification | Code | 1 |
| Adaptive Checkpoint Adjoint Method for Gradient Estimation in Neural ODE | Code | 1 |
| ELSA: Enhanced Local Self-Attention for Vision Transformer | Code | 1 |
| A New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation | Code | 1 |
| Emerging Properties in Self-Supervised Vision Transformers | Code | 1 |
| Almost-Orthogonal Layers for Efficient General-Purpose Lipschitz Networks | Code | 1 |
| Cached Transformers: Improving Transformers with Differentiable Memory Cache | Code | 1 |
| Fcaformer: Forward Cross Attention in Hybrid Vision Transformer | Code | 1 |
| Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation | Code | 1 |
| AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty | Code | 1 |
| Deep Networks with Stochastic Depth | Code | 1 |
Page 16 of 209

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CoCa (finetuned) | Top 1 Accuracy | 91 | | Unverified |
| 2 | Model soups (BASIC-L) | Top 1 Accuracy | 90.98 | | Unverified |
| 3 | Model soups (ViT-G/14) | Top 1 Accuracy | 90.94 | | Unverified |
| 4 | DaViT-G | Top 1 Accuracy | 90.4 | | Unverified |
| 5 | DaViT-H | Top 1 Accuracy | 90.2 | | Unverified |
| 6 | Meta Pseudo Labels (EfficientNet-L2) | Top 1 Accuracy | 90.2 | | Unverified |
| 7 | SwinV2-G | Top 1 Accuracy | 90.17 | | Unverified |
| 8 | MAWS (ViT-6.5B) | Top 1 Accuracy | 90.1 | | Unverified |
| 9 | Florence-CoSwin-H | Top 1 Accuracy | 90.05 | | Unverified |
| 10 | Meta Pseudo Labels (EfficientNet-B6-Wide) | Top 1 Accuracy | 90 | | Unverified |
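
For readers unfamiliar with the metric in the table above, Top 1 Accuracy counts a prediction as correct only when the single highest-scoring class matches the true label. The following is a minimal sketch of that computation, assuming the claimed values are percentages; the function name `top1_accuracy` and the sample scores are made up for illustration and are not taken from any listed paper.

```python
def top1_accuracy(scores, targets):
    """Percentage of samples whose highest-scoring class equals the target."""
    if len(scores) != len(targets):
        raise ValueError("scores and targets must have the same length")
    if not scores:
        raise ValueError("need at least one sample")
    correct = 0
    for row, target in zip(scores, targets):
        # Index of the largest score is the model's top-1 prediction.
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += predicted == target
    return 100.0 * correct / len(targets)

# Made-up scores for four images over three classes (assumed values).
scores = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]
targets = [1, 0, 2, 1]
accuracy = top1_accuracy(scores, targets)
print(f"{accuracy:.2f}")
```

Reported leaderboard numbers depend on the evaluation set and protocol, which this page does not state, so the sketch only shows the arithmetic of the metric.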