SOTAVerified

Image Classification

Image Classification is a fundamental task in visual recognition that aims to understand and categorize an image as a whole under a specific label. Unlike object detection, which involves classifying and localizing multiple objects within an image, image classification typically pertains to single-object images. When the classification becomes highly detailed or reaches instance level, it is often referred to as image retrieval, which also involves finding similar images in a large database.

Source: Metamorphic Testing for Object Detection Systems
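
As a minimal sketch of the description above, whole-image classification can be reduced to picking one label from a vector of per-class scores. The function name `classify`, the example scores, and the class names are illustrative assumptions, not taken from the source; a real system would obtain the scores from a trained model.

```python
from typing import Sequence


def classify(scores: Sequence[float], class_names: Sequence[str]) -> str:
    """Return the single label with the highest score for the whole image.

    `scores` holds one (hypothetical) model score per class, in the same
    order as `class_names`.
    """
    if not scores or len(scores) != len(class_names):
        raise ValueError("scores and class_names must be non-empty and equal in length")
    # Index of the highest-scoring class (first one wins on ties).
    best = max(range(len(scores)), key=lambda i: scores[i])
    return class_names[best]


# One image yields one label, unlike detection, which returns many (label, box) pairs.
label = classify([0.1, 0.7, 0.2], ["cat", "dog", "ship"])
```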

Papers

Showing 1651–1700 of 10419 papers

Title | Status | Hype
CAM-Based Methods Can See through Walls | Code | 0
Enhancing Ship Classification in Optical Satellite Imagery: Integrating Convolutional Block Attention Module with ResNet for Improved Performance | - | 0
ImageNot: A contrast with ImageNet preserves model rankings | Code | 0
Cross-to-merge training with class balance strategy for learning with noisy labels | Code | 0
Lipsum-FT: Robust Fine-Tuning of Zero-Shot Models Using Random Text Guidance | Code | 0
Instance-Aware Group Quantization for Vision Transformers | - | 0
Can Biases in ImageNet Models Explain Generalization? | Code | 1
Parallel Proportional Fusion of Spiking Quantum Neural Network for Optimizing Image Classification | - | 0
Diagnosis of Skin Cancer Using VGG16 and VGG19 Based Transfer Learning Models | - | 0
Improving Visual Recognition with Hyperbolical Visual Hierarchy Mapping | Code | 1
Harnessing The Power of Attention For Patch-Based Biomedical Image Classification | - | 0
Computation and Communication Efficient Lightweighting Vertical Federated Learning for Smart Building IoT | - | 0
Learn "No" to Say "Yes" Better: Improving Vision-Language Models via Negations | Code | 1
Heterogeneous Network Based Contrastive Learning Method for PolSAR Land Cover Classification | Code | 0
MCNet: A crowd denstity estimation network based on integrating multiscale attention module | - | 0
MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection | - | 0
Diverse Feature Learning by Self-distillation and Reset | - | 0
Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach | Code | 1
Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model | Code | 1
DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | Code | 2
The Bad Batches: Enhancing Self-Supervised Learning in Image Classification Through Representative Batch Curation | - | 0
RSMamba: Remote Sensing Image Classification with State Space Model | Code | 3
Robustness and Visual Explanation for Black Box Image, Video, and ECG Signal Classification with Reinforcement Learning | - | 0
Multi-scale Unified Network for Image Classification | - | 0
The Impact of Uniform Inputs on Activation Sparsity and Energy-Latency Attacks in Computer Vision | Code | 0
Uncertainty-Aware SAR ATR: Defending Against Adversarial Attacks via Bayesian Neural Networks | - | 0
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models | Code | 7
Targeted Visualization of the Backbone of Encoder LLMs | Code | 1
The Need for Speed: Pruning Transformers with One Recipe | Code | 1
Boosting Few-Shot Learning with Disentangled Self-Supervised Learning and Meta-Learning for Medical Image Classification | - | 0
Tiny Models are the Computational Saver for Large Models | Code | 0
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition | Code | 3
Task2Box: Box Embeddings for Modeling Asymmetric Task Relationships | Code | 0
Histogram Layers for Neural Engineered Features | Code | 0
Enhancing Neural Network Representations with Prior Knowledge-Based Normalization | Code | 0
DeepGleason: a System for Automated Gleason Grading of Prostate Cancer using Deep Neural Networks | Code | 1
Assessing the Performance of Deep Learning for Automated Gleason Grading in Prostate Cancer | - | 0
Leveraging Deep Learning and Xception Architecture for High-Accuracy MRI Classification in Alzheimer Diagnosis | - | 0
CBGT-Net: A Neuromimetic Architecture for Robust Classification of Streaming Data | Code | 0
On machine learning analysis of atomic force microscopy images for image classification, sample surface recognition | - | 0
Multi-Task Learning with Multi-Task Optimization | - | 0
A Deep Learning Architectures for Kidney Disease Classification | - | 0
iDAT: inverse Distillation Adapter-Tuning | Code | 1
VLM-CPL: Consensus Pseudo Labels from Vision-Language Models for Human Annotation-Free Pathological Image Classification | Code | 1
Do not trust what you trust: Miscalibration in Semi-supervised Learning | Code | 0
ParFormer: A Vision Transformer with Parallel Mixer and Sparse Channel Attention Patch Embedding | - | 0
Extracting Human Attention through Crowdsourced Patch Labeling | - | 0
Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion | - | 0
Clean-image Backdoor Attacks | - | 0
Image Classification with Rotation-Invariant Variational Quantum Circuits | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | CoCa (finetuned) | Top 1 Accuracy | 91 | - | Unverified
2 | Model soups (BASIC-L) | Top 1 Accuracy | 90.98 | - | Unverified
3 | Model soups (ViT-G/14) | Top 1 Accuracy | 90.94 | - | Unverified
4 | DaViT-G | Top 1 Accuracy | 90.4 | - | Unverified
5 | Meta Pseudo Labels (EfficientNet-L2) | Top 1 Accuracy | 90.2 | - | Unverified
6 | DaViT-H | Top 1 Accuracy | 90.2 | - | Unverified
7 | SwinV2-G | Top 1 Accuracy | 90.17 | - | Unverified
8 | MAWS (ViT-6.5B) | Top 1 Accuracy | 90.1 | - | Unverified
9 | Florence-CoSwin-H | Top 1 Accuracy | 90.05 | - | Unverified
10 | RevCol-H | Top 1 Accuracy | 90 | - | Unverified
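
The Top 1 Accuracy metric in the table above is conventionally the percentage of test images whose highest-scoring predicted class equals the ground-truth label. The sketch below shows one way that figure could be computed; the function name `top1_accuracy` and the toy scores are illustrative assumptions and do not reproduce any listed result.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Percentage of samples whose highest-scoring class index equals the true label."""
    if not labels or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equal in length")
    correct = 0
    for row, true_label in zip(scores, labels):
        # Top-1 prediction: index of the largest score in this row.
        predicted = max(range(len(row)), key=lambda i: row[i])
        if predicted == true_label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy example with 4 images and 2 classes; 3 of the 4 predictions are correct.
toy_scores = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
toy_labels = [0, 1, 1, 1]
accuracy = top1_accuracy(toy_scores, toy_labels)  # 75.0
```

A verified entry would require re-running this kind of evaluation on the benchmark's official test split, which is presumably why the Verified column is empty for claimed-only results.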