SOTAVerified

Image Classification

Image Classification is a fundamental task in visual recognition that aims to understand an image as a whole and assign it a specific label. Unlike object detection, which involves both classifying and localizing multiple objects within an image, image classification typically pertains to single-object images. When the classification becomes highly detailed or reaches the instance level, it is often referred to as image retrieval, which also involves finding similar images in a large database.
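
As a rough illustration of labelling an image as a whole, the sketch below turns one vector of per-class scores into a single label. The function names `softmax` and `classify`, the scores, and the class names are all hypothetical; no particular model or library from the papers listed here is assumed.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, class_names):
    """Return the single most probable label and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Hypothetical scores for one image over three made-up classes.
label, prob = classify([2.0, 0.5, -1.0], ["cat", "dog", "car"])
print(label, round(prob, 3))
```

An object detector would instead return several labels, each paired with a location, which is the contrast drawn in the description above.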

Source: Metamorphic Testing for Object Detection Systems

Papers

Showing 351-400 of 10419 papers

Title | Status | Hype
Lightweight Dataset Pruning without Full Training via Example Difficulty and Prediction Uncertainty | Code | 1
From Pixels to Components: Eigenvector Masking for Visual Representation Learning | Code | 1
Gompertz Linear Units: Leveraging Asymmetry for Enhanced Learning Dynamics | Code | 1
Polynomial, trigonometric, and tropical activations | Code | 1
SPECIAL: Zero-shot Hyperspectral Image Classification With CLIP | Code | 1
Communication-Efficient Federated Learning Based on Explanation-Guided Pruning for Remote Sensing Image Classification | Code | 1
Merging Feed-Forward Sublayers for Compressed Transformers | Code | 1
Online Continual Learning: A Systematic Literature Review of Approaches, Challenges, and Benchmarks | Code | 1
VisionGRU: A Linear-Complexity RNN Model for Efficient Image Analysis | Code | 1
Beyond Gradient Averaging in Parallel Optimization: Improved Robustness through Gradient Agreement Filtering | Code | 1
Towards Interpretable Radiology Report Generation via Concept Bottlenecks using a Multi-Agentic RAG | Code | 1
Continual Learning Using a Kernel-Based Method Over Foundation Models | Code | 1
Mamba2D: A Natively Multi-Dimensional State-Space Model for Vision Tasks | Code | 1
Does VLM Classification Benefit from LLM Description Semantics? | Code | 1
RapidNet: Multi-Level Dilated Convolution Based Mobile Backbone | Code | 1
Revisiting Weight Averaging for Model Merging | Code | 1
IMPACT: A Large-scale Integrated Multimodal Patent Analysis and Creation Dataset for Design Patents | Code | 1
Sparse autoencoders reveal selective remapping of visual concepts during adaptation | Code | 1
Grounding Descriptions in Images informs Zero-Shot Visual Recognition | Code | 1
Dual-Branch Subpixel-Guided Network for Hyperspectral Image Classification | Code | 1
Token Cropr: Faster ViTs for Quite a Few Tasks | Code | 1
On the Performance Analysis of Momentum Method: A Frequency Domain Perspective | Code | 1
CLIP meets DINO for Tuning Zero-Shot Classifier using Unlabeled Image Collections | Code | 1
Vision Mamba Distillation for Low-resolution Fine-grained Image Classification | Code | 1
Spectral-Spatial Transformer with Active Transfer Learning for Hyperspectral Image Classification | Code | 1
FOCUS: Knowledge-enhanced Adaptive Visual Compression for Few-shot Whole Slide Image Classification | Code | 1
CODE-CL: Conceptor-Based Gradient Projection for Deep Continual Learning | Code | 1
MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map | Code | 1
Vision Eagle Attention: a new lens for advancing image classification | Code | 1
HMIL: Hierarchical Multi-Instance Learning for Fine-Grained Whole Slide Image Classification | Code | 1
Training objective drives the consistency of representational similarity across datasets | Code | 1
RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models | Code | 1
Interpretable Image Classification with Adaptive Prototype-based Vision Transformers | Code | 1
FewVS: A Vision-Semantics Integration Framework for Few-Shot Image Classification | Code | 1
Is Less More? Exploring Token Condensation as Training-free Adaptation for CLIP | Code | 1
Interpreting and Analysing CLIP's Zero-Shot Image Classification via Mutual Knowledge | Code | 1
GlobalMamba: Global Image Serialization for Vision Mamba | Code | 1
Robust 3D Point Clouds Classification based on Declarative Defenders | Code | 1
DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention | Code | 1
Bilinear MLPs enable weight-based mechanistic interpretability | Code | 1
QuadMamba: Learning Quadtree-based Selective Scan for Visual State Space Model | Code | 1
Parameter Efficient Fine-tuning via Explained Variance Adaptation | Code | 1
NegMerge: Consensual Weight Negation for Strong Machine Unlearning | Code | 1
FACMIC: Federated Adaptative CLIP Model for Medical Image Classification | Code | 1
SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Classification | Code | 1
MONICA: Benchmarking on Long-tailed Medical Image Classification | Code | 1
Vision-Language Models are Strong Noisy Label Detectors | Code | 1
All-in-One Image Coding for Joint Human-Machine Vision with Multi-Path Aggregation | Code | 1
Realistic Evaluation of Model Merging for Compositional Generalization | Code | 1
Uni-Med: A Unified Medical Generalist Foundation Model For Multi-Task Learning Via Connector-MoE | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | CoCa (finetuned) | Top 1 Accuracy | 91 |  | Unverified
2 | Model soups (BASIC-L) | Top 1 Accuracy | 90.98 |  | Unverified
3 | Model soups (ViT-G/14) | Top 1 Accuracy | 90.94 |  | Unverified
4 | DaViT-G | Top 1 Accuracy | 90.4 |  | Unverified
5 | Meta Pseudo Labels (EfficientNet-L2) | Top 1 Accuracy | 90.2 |  | Unverified
6 | DaViT-H | Top 1 Accuracy | 90.2 |  | Unverified
7 | SwinV2-G | Top 1 Accuracy | 90.17 |  | Unverified
8 | MAWS (ViT-6.5B) | Top 1 Accuracy | 90.1 |  | Unverified
9 | Florence-CoSwin-H | Top 1 Accuracy | 90.05 |  | Unverified
10 | RevCol-H | Top 1 Accuracy | 90 |  | Unverified
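
The metric in this table, Top 1 Accuracy, is commonly defined as the share of test images whose highest-scoring predicted class equals the ground-truth label. The sketch below computes it as a percentage, on the assumption that the claimed values are percentages. The function name `top1_accuracy` and the toy scores and labels are hypothetical and unrelated to any model listed here.

```python
def top1_accuracy(score_rows, true_labels):
    """Percentage of samples whose highest-scoring class is the true label."""
    if not score_rows or len(score_rows) != len(true_labels):
        raise ValueError("need one non-empty score row per true label")
    correct = 0
    for scores, truth in zip(score_rows, true_labels):
        # Index of the largest score is the top-1 prediction.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted == truth)
    return 100.0 * correct / len(true_labels)


# Made-up scores for four images over three classes.
toy_scores = [
    [0.1, 0.7, 0.2],  # predicts class 1
    [0.8, 0.1, 0.1],  # predicts class 0
    [0.3, 0.3, 0.4],  # predicts class 2
    [0.6, 0.3, 0.1],  # predicts class 0
]
toy_labels = [1, 0, 2, 1]  # the last image is misclassified

accuracy = top1_accuracy(toy_scores, toy_labels)
print(accuracy)
```

Three of the four toy predictions match their labels, so the sketch reports 75.0.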