SOTAVerified

Fine-Grained Image Classification

Fine-Grained Image Classification is a computer vision task in which the goal is to classify images into subcategories of a larger category, for example distinguishing species of birds or types of flowers. The task is called fine-grained because the model must tell apart subtle differences in visual appearance and pattern, which makes it more challenging than ordinary image classification.

(Image credit: Looking for the Devil in the Details)

Papers

Showing 51–100 of 353 papers

| Title | Status | Hype |
| --- | --- | --- |
| Human-in-the-Loop Visual Re-ID for Population Size Estimation | Code | 0 |
| Good Questions Help Zero-Shot Image Reasoning | Code | 1 |
| Generative Parameter-Efficient Fine-Tuning | Code | 1 |
| Meta Co-Training: Two Views are Better than One | Code | 1 |
| OmniVec: Learning robust representations with cross modal sharing |  | 0 |
| A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis | Code | 1 |
| Dining on Details: LLM-Guided Expert Networks for Fine-Grained Food Recognition |  | 0 |
| BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping | Code | 1 |
| Gramian Attention Heads are Strong yet Efficient Vision Learners | Code | 0 |
| Learning with Unmasked Tokens Drives Stronger Vision Learners | Code | 1 |
| Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts | Code | 1 |
| Delving into Multimodal Prompting for Fine-grained Visual Classification |  | 0 |
| PCNN: Probable-Class Nearest-Neighbor Explanations Improve Fine-Grained Image Classification Accuracy for AIs and Humans | Code | 0 |
| Masking Strategies for Background Bias Removal in Computer Vision Models | Code | 1 |
| Multiscale patch-based feature graphs for image classification | Code | 0 |
| Deep Neural Networks Fused with Textures for Image Classification |  | 0 |
| Fine-Grained Sports, Yoga, and Dance Postures Recognition: A Benchmark Analysis |  | 0 |
| Task-Oriented Channel Attention for Fine-Grained Few-Shot Classification | Code | 1 |
| GIST: Generating Image-Specific Text for Fine-grained Object Classification | Code | 1 |
| Diffusion Models Beat GANs on Image Classification | Code | 1 |
| Semantically-Prompted Language Models Improve Visual Descriptions |  | 0 |
| TOAST: Transfer Learning via Attention Steering | Code | 1 |
| Feature Channel Adaptive Enhancement for Fine-Grained Visual Classification |  | 0 |
| Salient Mask-Guided Vision Transformer for Fine-Grained Classification | Code | 1 |
| Understanding Gaussian Attention Bias of Vision Transformers Using Effective Receptive Fields | Code | 0 |
| Leaf Cultivar Identification via Prototype-enhanced Learning |  | 0 |
| Reduction of Class Activation Uncertainty with Background Information | Code | 1 |
| PVP: Pre-trained Visual Parameter-Efficient Tuning |  | 0 |
| Learning Partial Correlation based Deep Visual Representation for Image Classification | Code | 1 |
| DINOv2: Learning Robust Visual Features without Supervision | Code | 6 |
| Your Diffusion Model is Secretly a Zero-Shot Classifier | Code | 2 |
| Take 5: Interpretable Image Classification with a Handful of Features | Code | 1 |
| Learn from Each Other to Classify Better: Cross-layer Mutual Attention Learning for Fine-grained Visual Classification | Code | 1 |
| Cascading Hierarchical Networks with Multi-task Balanced Loss for Fine-grained hashing | Code | 0 |
| Fine-grained Visual Classification with High-temperature Refinement and Background Suppression | Code | 1 |
| Semantic Feature Integration network for Fine-grained Visual Classification |  | 0 |
| Fine-Grained Visual Classification via Internal Ensemble Learning Transformer | Code | 1 |
| How to Use Dropout Correctly on Residual Networks with Batch Normalization | Code | 0 |
| LiT Tuned Models for Efficient Species Detection | Code | 0 |
| On the Ideal Number of Groups for Isometric Gradient Propagation |  | 0 |
| The CropAndWeed Dataset: A Multi-Modal Learning Approach for Efficient Crop and Weed Manipulation | Code | 1 |
| Multi-View Active Fine-Grained Visual Recognition | Code | 0 |
| An Erudite Fine-Grained Visual Classification Model |  | 0 |
| TransIFC: Invariant Cues-aware Feature Concentration Learning for Efficient Fine-grained Bird Image Classification |  | 0 |
| Part-guided Relational Transformers for Fine-grained Visual Recognition | Code | 1 |
| Towards Scene Understanding for Autonomous Operations on Airport Aprons | Code | 1 |
| Bi-directional Feature Reconstruction Network for Fine-Grained Few-Shot Image Classification | Code | 1 |
| Data Augmentation Vision Transformer for Fine-grained Image Classification |  | 0 |
| Penalizing the Hard Example But Not Too Much: A Strong Baseline for Fine-Grained Visual Classification | Code | 1 |
| 2nd Place Solution to Google Universal Image Embedding | Code | 1 |

Benchmark Results

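| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TResnet-L + PMD | Accuracy | 97.3 |  | Unverified |
| 2 | CMAL-Net | Accuracy | 97.1 |  | Unverified |
| 3 | I2-HOFI | Accuracy | 96.92 |  | Unverified |
| 4 | TResNet-L + ML-Decoder | Accuracy | 96.41 |  | Unverified |
| 5 | DAT | Accuracy | 96.2 |  | Unverified |
| 6 | ALIGN | Accuracy | 96.13 |  | Unverified |
| 7 | SR-GNN | Accuracy | 96.1 |  | Unverified |
| 8 | EffNet-L2 (SAM) | Accuracy | 95.96 |  | Unverified |
| 9 | SaSPA + CAL | Accuracy | 95.72 |  | Unverified |
| 10 | CAP | Accuracy | 95.7 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | I2-HOFI | Accuracy | 96.42 |  | Unverified |
| 2 | SR-GNN | Accuracy | 95.4 |  | Unverified |
| 3 | Inceptionv4 | Accuracy | 95.11 |  | Unverified |
| 4 | CAP | Accuracy | 94.9 |  | Unverified |
| 5 | CMAL-Net | Accuracy | 94.7 |  | Unverified |
| 6 | TBMSL-Net | Accuracy | 94.7 |  | Unverified |
| 7 | CSQA-Net | Accuracy | 94.7 |  | Unverified |
| 8 | PART | Accuracy | 94.6 |  | Unverified |
| 9 | AENet | Accuracy | 94.5 |  | Unverified |
| 10 | SaSPA + CAL | Accuracy | 94.5 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | HERBS | Accuracy | 93.1 |  | Unverified |
| 2 | PIM | Accuracy | 92.8 |  | Unverified |
| 3 | MDCM | Accuracy | 92.5 |  | Unverified |
| 4 | SFETrans | Accuracy | 91.8 |  | Unverified |
| 5 | CAP | Accuracy | 91.8 |  | Unverified |
| 6 | IELT | Accuracy | 91.8 |  | Unverified |
| 7 | TransFG | Accuracy | 91.7 |  | Unverified |
| 8 | SWAG (ViT H/14) | Accuracy | 91.7 |  | Unverified |
| 9 | ViT-NeT | Accuracy | 91.7 |  | Unverified |
| 10 | FFVT | Accuracy | 91.6 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | HERBS | Accuracy | 93 |  | Unverified |
| 2 | MetaFormer (MetaFormer-2,384) | Accuracy | 93 |  | Unverified |
| 3 | PIM | Accuracy | 92.8 |  | Unverified |
| 4 | ViT-NeT (SwinV2-B) | Accuracy | 92.5 |  | Unverified |
| 5 | MPSA | Accuracy | 92.5 |  | Unverified |
| 6 | CSQA-Net | Accuracy | 92.3 |  | Unverified |
| 7 | I2-HOFI | Accuracy | 92.12 |  | Unverified |
| 8 | MDCM | Accuracy | 92 |  | Unverified |
| 9 | CGL | Accuracy | 91.7 |  | Unverified |
| 10 | SR-GNN | Accuracy | 91.2 |  | Unverified |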