SOTAVerified

Image Classification

Image classification is a fundamental task in visual recognition that aims to understand an image as a whole and assign it a specific label. Unlike object detection, which involves both classifying and localizing multiple objects within an image, image classification typically applies to images that contain a single object. When the classification becomes highly fine-grained or reaches the instance level, it is often referred to as image retrieval, which also involves finding similar images in a large database.
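To make the distinction above concrete, here is a minimal sketch in plain Python. It assumes a model has already produced per-class scores and image embeddings; the function names (`classify`, `retrieve`), the labels, and all numbers are hypothetical and chosen only for illustration.

```python
import math

def classify(scores, labels):
    """Classification: assign the whole image the single best-scoring label."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best]

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query, database, k=2):
    """Retrieval: rank database images by similarity to the query embedding."""
    ranked = sorted(database, key=lambda name: cosine(query, database[name]),
                    reverse=True)
    return ranked[:k]

# Hypothetical class scores for one image.
labels = ["cat", "dog", "bird"]
scores = [0.1, 0.7, 0.2]
predicted = classify(scores, labels)

# Hypothetical 2-D embeddings standing in for a large image database.
database = {"img_a": [1.0, 0.0], "img_b": [0.9, 0.1], "img_c": [0.0, 1.0]}
neighbours = retrieve([1.0, 0.0], database, k=2)
```

Classification returns one label from a fixed set, whereas retrieval returns a ranked list of existing images, which is why instance-level recognition is usually framed as the latter.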

Source: Metamorphic Testing for Object Detection Systems

Papers

Showing 1801-1850 of 10419 papers

Title | Status | Hype
DyCE: Dynamically Configurable Exiting for Deep Learning Compression and Real-time Scaling | Code | 0
When do Convolutional Neural Networks Stop Learning? | Code | 0
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures | Code | 4
NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function | Code | 0
Fourier-basis Functions to Bridge Augmentation Gap: Rethinking Frequency Augmentation in Image Classification | Code | 1
Transformers for Supervised Online Continual Learning | - | 0
Beyond Inference: Performance Analysis of DNN Server Overheads for Computer Vision | - | 0
ELA: Efficient Local Attention for Deep Convolutional Neural Networks | - | 0
Can a Confident Prior Replace a Cold Posterior? | Code | 0
Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation | - | 0
VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks | Code | 3
SURE: SUrvey REcipes for building reliable and robust deep networks | Code | 2
Spurious Feature Eraser: Stabilizing Test-Time Adaptation for Vision-Language Foundation Model | Code | 0
HyenaPixel: Global Image Context with Convolutions | Code | 0
Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction | Code | 1
Stitching Gaps: Fusing Situated Perceptual Knowledge with Vision Transformers for High-Level Image Classification | - | 0
Decompose-and-Compose: A Compositional Approach to Mitigating Spurious Correlation | Code | 0
Assessing Visually-Continuous Corruption Robustness of Neural Networks Relative to Human Performance | - | 0
Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization | - | 0
Classes Are Not Equal: An Empirical Study on Image Recognition Fairness | - | 0
Deep Neural Network Models Trained With A Fixed Random Classifier Transfer Better Across Domains | - | 0
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling | - | 0
A Modular System for Enhanced Robustness of Multimedia Understanding Networks via Deep Parametric Estimation | Code | 0
Understanding Neural Network Binarization with Forward and Backward Proximal Quantizers | - | 0
Scaling Supervised Local Learning with Augmented Auxiliary Networks | Code | 0
SDF2Net: Shallow to Deep Feature Fusion Network for PolSAR Image Classification | Code | 1
Offline Writer Identification Using Convolutional Neural Network Activation Features | - | 0
Searching a Lightweight Network Architecture for Thermal Infrared Pedestrian Tracking | - | 0
Enhancing Continuous Domain Adaptation with Multi-Path Transfer Curriculum | - | 0
Investigating the Robustness of Vision Transformers against Label Noise in Medical Image Classification | - | 0
DEYO: DETR with YOLO for End-to-End Object Detection | Code | 2
MV-Swin-T: Mammogram Classification with Multi-view Swin Transformer | Code | 1
Intelligent Known and Novel Aircraft Recognition -- A Shift from Classification to Similarity Learning for Combat Identification | - | 0
EncodingNet: A Novel Encoding-based MAC Design for Efficient Neural Network Acceleration | Code | 0
Key Design Choices in Source-Free Unsupervised Domain Adaptation: An In-depth Empirical Analysis | - | 0
A Comprehensive Survey of Convolutions in Deep Learning: Applications, Challenges, and Future Trends | - | 0
Foveated Retinotopy Improves Classification and Localization in CNNs | - | 0
G-RepsNet: A Fast and General Construction of Equivariant Networks for Arbitrary Matrix Groups | - | 0
SoK: Analyzing Adversarial Examples: A Framework to Study Adversary Knowledge | - | 0
PaCKD: Pattern-Clustered Knowledge Distillation for Compressing Memory Access Prediction Models | Code | 0
Partial Search in a Frozen Network is Enough to Find a Strong Lottery Ticket | - | 0
Byzantine-Robust Federated Learning: Impact of Client Subsampling and Local Updates | - | 0
How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena | - | 0
LangXAI: Integrating Large Vision Models for Generating Textual Explanations to Enhance Explainability in Visual Perception Tasks | Code | 1
Integrating kNN with Foundation Models for Adaptable and Privacy-Aware Image Classification | Code | 0
Weakly Supervised Object Detection in Chest X-Rays with Differentiable ROI Proposal Networks and Soft ROI Pooling | Code | 1
Perceiving Longer Sequences With Bi-Directional Cross-Attention Transformers | Code | 1
CowScape: Quantitative reconstruction of the conformational landscape of biological macromolecules from cryo-EM data | - | 0
Efficient Multimodal Learning from Data-centric Perspective | Code | 5
ReViT: Enhancing Vision Transformers Feature Diversity with Attention Residual Connections | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | CoCa (finetuned) | Top 1 Accuracy | 91 | - | Unverified
2 | Model soups (BASIC-L) | Top 1 Accuracy | 90.98 | - | Unverified
3 | Model soups (ViT-G/14) | Top 1 Accuracy | 90.94 | - | Unverified
4 | DaViT-G | Top 1 Accuracy | 90.4 | - | Unverified
5 | Meta Pseudo Labels (EfficientNet-L2) | Top 1 Accuracy | 90.2 | - | Unverified
6 | DaViT-H | Top 1 Accuracy | 90.2 | - | Unverified
7 | SwinV2-G | Top 1 Accuracy | 90.17 | - | Unverified
8 | MAWS (ViT-6.5B) | Top 1 Accuracy | 90.1 | - | Unverified
9 | Florence-CoSwin-H | Top 1 Accuracy | 90.05 | - | Unverified
10 | RevCol-H | Top 1 Accuracy | 90 | - | Unverified
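
For readers unfamiliar with the metric, top-1 accuracy is the share of images whose highest-scoring predicted class equals the ground-truth class. The sketch below shows one way to compute it in plain Python. It assumes the reported values are percentages; the function name `top1_accuracy` and the toy scores are hypothetical and unrelated to any model listed here.

```python
def top1_accuracy(score_rows, true_labels):
    """Percentage of samples whose arg-max class matches the true class index."""
    if not true_labels or len(score_rows) != len(true_labels):
        raise ValueError("need one non-empty score row per true label")
    correct = 0
    for scores, truth in zip(score_rows, true_labels):
        # Index of the highest score is the model's single (top-1) prediction.
        predicted = max(range(len(scores)), key=lambda i: scores[i])
        if predicted == truth:
            correct += 1
    return 100.0 * correct / len(true_labels)

# Toy example: four images, three classes (hypothetical scores).
score_rows = [
    [0.1, 0.8, 0.1],  # predicts class 1
    [0.6, 0.3, 0.1],  # predicts class 0
    [0.2, 0.2, 0.6],  # predicts class 2
    [0.5, 0.4, 0.1],  # predicts class 0
]
true_labels = [1, 0, 2, 1]
accuracy = top1_accuracy(score_rows, true_labels)
print(accuracy)  # 75.0
```

Three of the four toy predictions match their labels, giving 75.0. By the same definition, a claimed score of 91 means that about 91 of every 100 evaluation images received the correct top prediction.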