SOTAVerified

Image Classification

Image classification is a fundamental task in visual recognition that aims to understand an image as a whole and assign it a single label. Unlike object detection, which involves both classifying and localizing multiple objects within an image, image classification typically deals with single-object images. When classification becomes very fine-grained or reaches the instance level, it is often referred to as image retrieval, which also involves finding similar images in a large database.

Source: Metamorphic Testing for Object Detection Systems

Papers

Showing 1101–1150 of 10419 papers

Title | Status | Hype
Towards Better Understanding Attribution Methods | Code | 1
CLCNet: Rethinking of Ensemble Modeling with Classification Confidence Network | Code | 1
EXACT: How to Train Your Accuracy | Code | 1
A graph-transformer for whole slide image classification | Code | 1
Masked Image Modeling with Denoising Contrast | Code | 1
An Empirical Investigation of Representation Learning for Imitation | Code | 1
Leveraging Uncertainty for Deep Interpretable Classification and Weakly-Supervised Segmentation of Histology Images | Code | 1
Few-Shot Image Classification Benchmarks are Too Far From Reality: Build Back Better with Semantic Task Sampling | Code | 1
Explainable Deep Learning Methods in Medical Image Classification: A Survey | Code | 1
Introspective Deep Metric Learning for Image Retrieval | Code | 1
When does dough become a bagel? Analyzing the remaining mistakes on ImageNet | Code | 1
CCMB: A Large-scale Chinese Cross-modal Benchmark | Code | 1
Investigating and Explaining the Frequency Bias in Image Classification | Code | 1
Image Classification With Small Datasets: Overview and Benchmark | Code | 1
CoCa: Contrastive Captioners are Image-Text Foundation Models | Code | 1
Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP) | Code | 1
Better plain ViT baselines for ImageNet-1k | Code | 1
Engineering flexible machine learning systems by traversing functionally-invariant paths | Code | 1
NeuralEF: Deconstructing Kernels by Deep Neural Networks | Code | 1
Semantic Information Recovery in Wireless Networks | Code | 1
Unlocking High-Accuracy Differentially Private Image Classification through Scale | Code | 1
Learning to Split for Automatic Bias Detection | Code | 1
Causal Transportability for Visual Recognition | Code | 1
Adaptive Split-Fusion Transformer | Code | 1
A survey on attention mechanisms for medical applications: are we moving towards better algorithms? | Code | 1
PolyLoss: A Polynomial Expansion Perspective of Classification Loss Functions | Code | 1
Surpassing the Human Accuracy: Detecting Gallbladder Cancer from USG Images with Curriculum Learning | Code | 1
Self-supervised Learning for Sonar Image Classification | Code | 1
Continual Hippocampus Segmentation with Transformers | Code | 1
Learning with Signatures | Code | 1
Towards Lightweight Transformer via Group-wise Transformation for Vision-and-Language Tasks | Code | 1
Pushing the Limits of Simple Pipelines for Few-Shot Learning: External Data and Fine-Tuning Make a Difference | Code | 1
DeiT III: Revenge of the ViT | Code | 1
ViTOL: Vision Transformer for Weakly Supervised Object Localization | Code | 1
ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension | Code | 1
No Token Left Behind: Explainability-Aided Image Classification and Generation | Code | 1
SuperpixelGridCut, SuperpixelGridMean and SuperpixelGridMix Data Augmentation | Code | 1
Generative Adversarial Networks for Image Augmentation in Agriculture: A Systematic Review | Code | 1
Total Variation Optimization Layers for Computer Vision | Code | 1
Fine-Grained Predicates Learning for Scene Graph Generation | Code | 1
Universal Representations: A Unified Look at Multiple Task and Domain Learning | Code | 1
MetaAudio: A Few-Shot Audio Classification Benchmark | Code | 1
Real-time Hyperspectral Imaging in Hardware via Trained Metasurface Encoders | Code | 1
Improving Vision Transformers by Revisiting High-frequency Components | Code | 1
Efficient Convolutional Neural Networks on Raspberry Pi for Image Classification | Code | 1
Deep Hyperspectral Unmixing using Transformer Network | Code | 1
Multimodal Fusion Transformer for Remote Sensing Image Classification | Code | 1
Collaborative Transformers for Grounded Situation Recognition | Code | 1
A fuzzy distance-based ensemble of deep models for cervical cancer detection | Code | 1
Fair Contrastive Learning for Facial Attribute Classification | Code | 1
Page 23 of 209

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | CoCa (finetuned) | Top 1 Accuracy | 91 | - | Unverified
2 | Model soups (BASIC-L) | Top 1 Accuracy | 90.98 | - | Unverified
3 | Model soups (ViT-G/14) | Top 1 Accuracy | 90.94 | - | Unverified
4 | DaViT-G | Top 1 Accuracy | 90.4 | - | Unverified
5 | Meta Pseudo Labels (EfficientNet-L2) | Top 1 Accuracy | 90.2 | - | Unverified
6 | DaViT-H | Top 1 Accuracy | 90.2 | - | Unverified
7 | SwinV2-G | Top 1 Accuracy | 90.17 | - | Unverified
8 | MAWS (ViT-6.5B) | Top 1 Accuracy | 90.1 | - | Unverified
9 | Florence-CoSwin-H | Top 1 Accuracy | 90.05 | - | Unverified
10 | RevCol-H | Top 1 Accuracy | 90 | - | Unverified
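
The Top 1 Accuracy values listed above are percentages. As a minimal sketch of how such a figure is commonly computed, the Python snippet below counts how often the highest-scoring class matches the true label. The function name `top1_accuracy` and the toy scores and labels are hypothetical illustrations, not taken from this page or from any listed model's evaluation code.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of samples whose highest-scoring class equals the label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    correct = 0
    for row, label in zip(scores, labels):
        # The predicted class is the index of the largest score (argmax).
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Hypothetical toy data: 4 images, 3 classes (illustrative only).
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
toy_labels = [1, 0, 2, 2]

accuracy = top1_accuracy(toy_scores, toy_labels)
print(f"Top-1 accuracy: {accuracy:.2f}%")
```

In this toy example three of the four predictions match their labels. Real leaderboard numbers depend on the dataset, preprocessing, and evaluation protocol, none of which this sketch models.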