SOTAVerified

Image Classification

Image Classification is a fundamental task in visual recognition that aims to understand an image as a whole and assign it a single label. Unlike object detection, which both classifies and localizes multiple objects within an image, image classification typically deals with single-object images. When the classification becomes very fine-grained or reaches the instance level, it is often referred to as image retrieval, which also involves finding similar images in a large database.

Source: Metamorphic Testing for Object Detection Systems

Papers

Showing 3651-3700 of 10419 papers

| Title | Status | Hype |
|---|---|---|
| Feature Weaken: Vicinal Data Augmentation for Classification | | 0 |
| Personalized Federated Learning with Hidden Information on Personalized Prior | | 0 |
| Towards Adversarial Robustness of Deep Vision Algorithms | | 0 |
| Non-Coherent Over-the-Air Decentralized Gradient Descent | | 0 |
| TensAIR: Real-Time Training of Neural Networks from Data-streams | Code | 0 |
| Invariant Learning via Diffusion Dreamed Distribution Shifts | | 0 |
| Vision Transformers in Medical Imaging: A Review | | 0 |
| Hyperbolic Sliced-Wasserstein via Geodesic and Horospherical Projections | Code | 0 |
| A Transformer Framework for Data Fusion and Multi-Task Learning in Smart Cities | Code | 0 |
| Contrastive Losses Are Natural Criteria for Unsupervised Video Summarization | Code | 1 |
| Efficient Feature Compression for Edge-Cloud Systems | Code | 0 |
| FedFA: Federated Learning with Feature Anchors to Align Features and Classifiers for Heterogeneous Data | Code | 1 |
| Data-Centric Debugging: mitigating model failures via targeted data collection | | 0 |
| Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information | Code | 1 |
| DeepVoxNet2: Yet another CNN framework | Code | 1 |
| Improving the Computer-Aided Estimation of Ulcerative Colitis Severity According to Mayo Endoscopic Score by Using Regression-Based Deep Learning | Code | 1 |
| A Stable, Fast, and Fully Automatic Learning Algorithm for Predictive Coding Networks | | 0 |
| Scalar Invariant Networks with Zero Bias | | 0 |
| Masked Reconstruction Contrastive Learning with Information Bottleneck Principle | | 0 |
| Probabilistic Deep Metric Learning for Hyperspectral Image Classification | Code | 0 |
| Identifying Spurious Correlations and Correcting them with an Explanation-based Learning | | 0 |
| Local Magnification for Data and Feature Augmentation | | 0 |
| Bayesian Federated Neural Matching that Completes Full Information | | 0 |
| Will Large-scale Generative Models Corrupt Future Datasets? | Code | 0 |
| Federated Adaptive Prompt Tuning for Multi-Domain Collaborative Learning | Code | 1 |
| VCI-LSTM: Vector Choquet Integral-based Long Short-Term Memory | | 0 |
| Robust Deep Learning for Autonomous Driving | Code | 1 |
| Interpreting Bias in the Neural Networks: A Peek Into Representational Similarity | | 0 |
| PKCAM: Previous Knowledge Channel Attention Module | Code | 1 |
| EVA: Exploring the Limits of Masked Visual Representation Learning at Scale | Code | 0 |
| Fcaformer: Forward Cross Attention in Hybrid Vision Transformer | Code | 1 |
| Bayesian Layer Graph Convolutioanl Network for Hyperspetral Image Classification | | 0 |
| SA-DPSGD: Differentially Private Stochastic Gradient Descent based on Simulated Annealing | | 0 |
| Sign Language to Text Conversion in Real Time using Transfer Learning | | 0 |
| Perceptual Video Coding for Machines via Satisfied Machine Ratio Modeling | Code | 1 |
| Mining Unseen Classes via Regional Objectness: A Simple Baseline for Incremental Segmentation | Code | 1 |
| Enhancing Few-shot Image Classification with Cosine Transformer | Code | 1 |
| Far Away in the Deep Space: Dense Nearest-Neighbor-Based Out-of-Distribution Detection | Code | 1 |
| MultiCrossViT: Multimodal Vision Transformer for Schizophrenia Prediction using Structural MRI and Functional Network Connectivity Data | | 0 |
| Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning | | 0 |
| AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities | Code | 4 |
| Depth and Representation in Vision Models | Code | 0 |
| A Comprehensive Survey of Transformers for Computer Vision | | 0 |
| Token Transformer: Can class token help window-based transformer build better long-range interactions? | | 0 |
| Equivariance with Learned Canonicalization Functions | | 0 |
| Dual Complementary Dynamic Convolution for Image Recognition | | 0 |
| REVEL Framework to measure Local Linear Explanations for black-box models: Deep Learning Image Classification case of study | Code | 0 |
| PAD-Net: An Efficient Framework for Dynamic Networks | Code | 1 |
| InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions | Code | 4 |
| MGiaD: Multigrid in all dimensions. Efficiency and robustness by coarsening in resolution and channel dimensions | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CoCa (finetuned) | Top 1 Accuracy | 91 | | Unverified |
| 2 | Model soups (BASIC-L) | Top 1 Accuracy | 90.98 | | Unverified |
| 3 | Model soups (ViT-G/14) | Top 1 Accuracy | 90.94 | | Unverified |
| 4 | DaViT-G | Top 1 Accuracy | 90.4 | | Unverified |
| 5 | Meta Pseudo Labels (EfficientNet-L2) | Top 1 Accuracy | 90.2 | | Unverified |
| 6 | DaViT-H | Top 1 Accuracy | 90.2 | | Unverified |
| 7 | SwinV2-G | Top 1 Accuracy | 90.17 | | Unverified |
| 8 | MAWS (ViT-6.5B) | Top 1 Accuracy | 90.1 | | Unverified |
| 9 | Florence-CoSwin-H | Top 1 Accuracy | 90.05 | | Unverified |
| 10 | RevCol-H | Top 1 Accuracy | 90 | | Unverified |
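
The benchmark metric, Top 1 Accuracy, is conventionally the percentage of images for which the model's highest-scoring class matches the ground-truth label. The following is a minimal sketch of that computation in plain Python. The function name `top1_accuracy` and the toy scores and labels are hypothetical, chosen only for illustration; they are not taken from any listed paper or from this site's evaluation code.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Percentage of examples whose highest-scoring class equals the true label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score; ties resolve to the lowest class index.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Hypothetical toy data: 4 images, 3 classes (illustrative values only).
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
toy_labels = [1, 0, 2, 0]

print(top1_accuracy(toy_scores, toy_labels))  # 3 of 4 correct -> 75.0
```

Under this reading, a claimed value of 91 would mean that the top prediction is correct for 91% of the evaluation images.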