SOTAVerified

Image Classification

Image classification is a fundamental task in visual recognition that aims to understand an image as a whole and assign it a specific label. Unlike object detection, which involves both classifying and localizing multiple objects within an image, image classification typically deals with single-object images. When the classification becomes highly detailed or reaches the instance level, it is often referred to as image retrieval, which also involves finding similar images in a large database.

Source: Metamorphic Testing for Object Detection Systems
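The contrast drawn in the description can be illustrated with a minimal sketch. The names `classify` and `Detection`, the class names, and all scores and box coordinates are hypothetical and chosen only for illustration; no real model is involved. A classifier yields one label for the whole image, while a detector yields several labelled boxes.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Detection:
    """One detected object: a label plus a bounding box (hypothetical layout)."""
    label: str
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def classify(scores: Dict[str, float]) -> str:
    """Return the single highest-scoring label for the whole image."""
    return max(scores, key=scores.get)


# Made-up per-class scores for one image.
scores = {"cat": 0.7, "dog": 0.2, "car": 0.1}
image_label = classify(scores)

# Made-up detector output for comparison: several objects, each with a location.
detections: List[Detection] = [
    Detection("cat", (10.0, 20.0, 110.0, 160.0)),
    Detection("dog", (150.0, 30.0, 260.0, 180.0)),
]

print(image_label)
print([d.label for d in detections])
```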

Papers

Showing 4951–5000 of 10420 papers

| Title | Status | Hype |
|---|---|---|
| Augmenting Convolutional networks with attention-based aggregation | Code | 1 |
| PRIME: A few primitives can boost robustness to common corruptions | Code | 1 |
| A Fistful of Words: Learning Transferable Visual Models from Bag-of-Words Supervision | | 0 |
| Deep Curriculum Learning for PolSAR Image Classification | | 0 |
| Virtuoso: Video-based Intelligence for real-time tuning on SOCs | | 0 |
| ELSA: Enhanced Local Self-Attention for Vision Transformer | Code | 1 |
| Improving Robustness and Uncertainty Modelling in Neural Ordinary Differential Equations | | 0 |
| Latent Time Neural Ordinary Differential Equations | | 0 |
| AED: An black-box NLP classifier model attacker | | 0 |
| Cross-Part Learning for Fine-Grained Image Classification | Code | 0 |
| Dynamically Stable Poincaré Embeddings for Neural Manifolds | | 0 |
| Offloading Algorithms for Maximizing Inference Accuracy on Edge Device Under a Time Constraint | | 0 |
| RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality | Code | 1 |
| Learned Queries for Efficient Local Attention | Code | 1 |
| A Vision-based Solution for Track Misalignment Detection | | 0 |
| Encoding Hierarchical Information in Neural Networks helps in Subpopulation Shift | | 0 |
| Learning with Label Noise for Image Retrieval by Selecting Interactions | | 0 |
| HarmoFL: Harmonizing Local and Global Drifts in Federated Learning on Heterogeneous Medical Images | Code | 1 |
| Transformers Can Do Bayesian Inference | Code | 1 |
| General Greedy De-bias Learning | Code | 0 |
| Denoised Labels for Financial Time-Series Data via Self-Supervised Learning | | 0 |
| Zero-shot and Few-shot Learning with Knowledge Graphs: A Comprehensive Survey | | 0 |
| Interpretable and Interactive Deep Multiple Instance Learning for Dental Caries Classification in Bitewing X-rays | Code | 0 |
| Rank4Class: A Ranking Formulation for Multiclass Classification | | 0 |
| Soundify: Matching Sound Effects to Video | | 0 |
| UniMiSS: Universal Medical Self-Supervised Learning via Breaking Dimensionality Barrier | Code | 1 |
| Towards End-to-End Image Compression and Analysis with Transformers | Code | 1 |
| Pixel Distillation: A New Knowledge Distillation Scheme for Low-Resolution Image Recognition | Code | 1 |
| A Simple Single-Scale Vision Transformer for Object Localization and Instance Segmentation | Code | 0 |
| An Empirical Investigation of the Role of Pre-training in Lifelong Learning | Code | 1 |
| How to augment your ViTs? Consistency loss and StyleAug, a random style transfer augmentation | | 0 |
| Use Image Clustering to Facilitate Technology Assisted Review | | 0 |
| Classification of diffraction patterns using a convolutional neural network in single particle imaging experiments performed at X-ray free-electron lasers | | 0 |
| RegionCLIP: Region-based Language-Image Pretraining | Code | 1 |
| Learning to Prompt for Continual Learning | Code | 1 |
| Pure Noise to the Rescue of Insufficient Data: Improving Imbalanced Classification by Training on Random Noise Images | Code | 1 |
| Learning Interpretable Models Through Multi-Objective Neural Architecture Search | Code | 0 |
| SGML: A Symmetric Graph Metric Learning Framework for Efficient Hyperspectral Image Classification | Code | 0 |
| Towards General and Efficient Active Learning | Code | 1 |
| Exploring Category-correlated Feature for Few-shot Image Classification | | 0 |
| EMDS-6: Environmental Microorganism Image Dataset Sixth Version for Image Denoising, Segmentation, Feature Extraction, Classification and Detection Methods Evaluation | | 0 |
| Heuristic Hyperparameter Optimization for Convolutional Neural Networks using Genetic Algorithm | Code | 1 |
| Towards a Unified Foundation Model: Jointly Pre-Training Transformers on Unpaired Images and Text | | 0 |
| An Interpretive Constrained Linear Model for ResNet and MgNet | | 0 |
| AdaViT: Adaptive Tokens for Efficient Vision Transformer | Code | 1 |
| WOOD: Wasserstein-based Out-of-Distribution Detection | Code | 1 |
| Simple and Robust Loss Design for Multi-Label Learning with Missing Labels | Code | 1 |
| Efficient and Reliable Overlay Networks for Decentralized Federated Learning | | 0 |
| Magnifying Networks for Images with Billions of Pixels | | 0 |
| A Discriminative Channel Diversification Network for Image Classification | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CoCa (finetuned) | Top 1 Accuracy | 91 | | Unverified |
| 2 | Model soups (BASIC-L) | Top 1 Accuracy | 90.98 | | Unverified |
| 3 | Model soups (ViT-G/14) | Top 1 Accuracy | 90.94 | | Unverified |
| 4 | DaViT-G | Top 1 Accuracy | 90.4 | | Unverified |
| 5 | Meta Pseudo Labels (EfficientNet-L2) | Top 1 Accuracy | 90.2 | | Unverified |
| 6 | DaViT-H | Top 1 Accuracy | 90.2 | | Unverified |
| 7 | SwinV2-G | Top 1 Accuracy | 90.17 | | Unverified |
| 8 | MAWS (ViT-6.5B) | Top 1 Accuracy | 90.1 | | Unverified |
| 9 | Florence-CoSwin-H | Top 1 Accuracy | 90.05 | | Unverified |
| 10 | RevCol-H | Top 1 Accuracy | 90 | | Unverified |
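
The leaderboard reports Top 1 Accuracy without defining it on this page. Under the common definition, it is the percentage of test images whose highest-scoring class matches the ground-truth label. The sketch below assumes that definition; the function name `top1_accuracy` and the toy scores and labels are made up for illustration and have no connection to the models listed above.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Percentage of examples whose arg-max class index equals the true label."""
    if len(scores) != len(labels) or len(labels) == 0:
        raise ValueError("scores and labels must be non-empty and of equal length")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the highest score in this row (first index wins on ties).
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy data: four images, three classes (illustrative values only).
scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
labels = [1, 0, 2, 2]

accuracy = top1_accuracy(scores, labels)
print(accuracy)  # 3 of 4 predictions match, so 75.0
```

Reported values such as 90.98 are on this same 0 to 100 scale, assuming the page uses percentages.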