SOTAVerified

Image Retrieval

Image retrieval is a fundamental, long-standing computer vision task: given a query, find similar images in a large database. It is often treated as a form of fine-grained, instance-level classification. Alongside classification and cross-modal retrieval, it is a core part of image recognition. By ranking images by visual similarity and other criteria, image retrieval lets users find relevant images efficiently, which makes it central to applications such as search and recommendation.
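As a rough illustration of ranking by visual similarity, the sketch below scores database images against a query by cosine similarity between feature vectors and returns the top matches. The three-dimensional vectors, the file names, and the `retrieve` helper are made up for this example; real systems would use embeddings from a trained model and usually an approximate nearest-neighbour index rather than a linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, database, k=3):
    """Return the ids of the k database images most similar to the query vector."""
    scored = [(cosine_similarity(query, vec), image_id)
              for image_id, vec in database.items()]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored[:k]]


# Hypothetical toy embeddings; in practice these come from a feature extractor.
database = {
    "tower_day.jpg": [0.9, 0.1, 0.0],
    "tower_night.jpg": [0.8, 0.3, 0.1],
    "beach.jpg": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]

print(retrieve(query, database, k=2))
```

With these toy vectors the two tower images rank above the beach image, since their embeddings point in nearly the same direction as the query.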

Extending CLIP for Category-to-image Retrieval in E-commerce

(Image credit: DELF)

Papers

Showing 251–300 of 2239 papers

Title | Status | Hype
Data-Efficient Multimodal Fusion on a Single GPU | Code | 1
CSIM: A Copula-based similarity index sensitive to local changes for Image quality assessment | Code | 1
One Surrogate to Fool Them All: Universal, Transferable, and Targeted Adversarial Attacks with CLIP | Code | 1
Bi-directional Training for Composed Image Retrieval via Text Prompt Learning | Code | 1
Data-Free Sketch-Based Image Retrieval | Code | 1
DeepPatent: Large scale patent drawing recognition and retrieval | Code | 1
Cross-Scale Context Extracted Hashing for Fine-Grained Image Binary Encoding | Code | 1
CREPE: Can Vision-Language Foundation Models Reason Compositionally? | Code | 1
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers | Code | 1
CoSMo: Content-Style Modulation for Image Retrieval With Text Feedback | Code | 1
3rd Place Solution to "Google Landmark Retrieval 2020" | Code | 1
Conversational Fashion Image Retrieval via Multiturn Natural Language Feedback | Code | 1
CoVR-2: Automatic Data Construction for Composed Video Retrieval | Code | 1
DeepSeqSLAM: A Trainable CNN+RNN for Joint Global Description and Sequence-based Place Recognition | Code | 1
ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval | Code | 1
Conditioned and Composed Image Retrieval Combining and Partially Fine-Tuning CLIP-Based Features | Code | 1
Context-I2W: Mapping Images to Context-dependent Words for Accurate Zero-Shot Composed Image Retrieval | Code | 1
Boosting vision transformers for image retrieval | Code | 1
Boosting 3-DoF Ground-to-Satellite Camera Localization Accuracy via Geometry-Guided Cross-View Transformer | Code | 1
MosAIc: Finding Artistic Connections across Culture with Conditional Image Retrieval | Code | 1
Contextually Affinitive Neighborhood Refinery for Deep Clustering | Code | 1
Contrastive Quantization with Code Memory for Unsupervised Image Retrieval | Code | 1
Correlation Verification for Image Retrieval | Code | 1
Breaking the Frame: Visual Place Recognition by Overlap Prediction | Code | 1
Compositional Learning of Image-Text Query for Image Retrieval | Code | 1
Cross-Batch Memory for Embedding Learning | Code | 1
Cross-Modal Fusion Distillation for Fine-Grained Sketch-Based Image Retrieval | Code | 1
Cross-Modal Hierarchical Modelling for Fine-Grained Sketch Based Image Retrieval | Code | 1
Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network | Code | 1
CurlingNet: Compositional Learning between Images and Text for Fashion IQ Data | Code | 1
DAS: Densely-Anchored Sampling for Deep Metric Learning | Code | 1
DASGIL: Domain Adaptation for Semantic and Geometric-aware Image-based Localization | Code | 1
Bridging the Gap: Multi-Level Cross-Modality Joint Alignment for Visible-Infrared Person Re-Identification | Code | 1
Composing Text and Image for Image Retrieval - An Empirical Odyssey | Code | 1
BroadFace: Looking at Tens of Thousands of People at Once for Face Recognition | Code | 1
A Novel Geo-Localization Method for UAV and Satellite Images Using Cross-View Consistent Attention | Code | 1
Bit-mask Robust Contrastive Knowledge Distillation for Unsupervised Semantic Hashing | Code | 1
CaLa: Complementary Association Learning for Augmenting Composed Image Retrieval | Code | 1
CgAT: Center-Guided Adversarial Training for Deep Hashing-Based Retrieval | Code | 1
Delta Descriptors: Change-Based Place Representation for Robust Visual Localization | Code | 1
CBVS: A Large-Scale Chinese Image-Text Benchmark for Real-World Short Video Search Scenarios | Code | 1
Detection and Retrieval of Out-of-Distribution Objects in Semantic Segmentation | Code | 1
Composite Sketch+Text Queries for Retrieving Objects with Elusive Names and Complex Interactions | Code | 1
Discriminative Region-based Multi-Label Zero-Shot Learning | Code | 1
Candidate Set Re-ranking for Composed Image Retrieval with Dual Multi-modal Encoder | Code | 1
Domain-invariant Similarity Activation Map Contrastive Learning for Retrieval-based Long-term Visual Localization | Code | 1
Unsupervised Sketch-to-Photo Synthesis | Code | 1
comp-syn: Perceptually Grounded Word Embeddings with Color | Code | 1
Contextual Similarity Aggregation with Self-attention for Visual Re-ranking | Code | 1
Composed Image Retrieval for Training-Free Domain Conversion | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SuperGlobal | mAP | 80.2 | | Unverified
2 | AMES | mAP | 80 | | Unverified
3 | Hypergraph propagation+community selection | mAP | 73 | | Unverified
4 | Token | mAP | 66.57 | | Unverified
5 | DELG+ α QE reranking + RRT reranking | mAP | 64 | | Unverified
6 | FIRe | mAP | 61.2 | | Unverified
7 | HOW | mAP | 56.9 | | Unverified
8 | ResNet101+ArcFace GLDv2-train-clean | mAP | 51.6 | | Unverified
9 | DELF–HQE+SP | mAP | 50.3 | | Unverified
10 | HesAff–rSIFT–HQE+SP | mAP | 49.7 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AMES | mAP | 90.7 | | Unverified
2 | Hypergraph propagation+Community selection | mAP | 88.4 | | Unverified
3 | Token | mAP | 82.28 | | Unverified
4 | FIRe | mAP | 81.8 | | Unverified
5 | DELG+ α QE reranking + RRT reranking | mAP | 80.4 | | Unverified
6 | HOW | mAP | 79.4 | | Unverified
7 | ResNet101+ArcFace GLDv2-train-clean | mAP | 74.2 | | Unverified
8 | DELF–HQE+SP | mAP | 73.4 | | Unverified
9 | HesAff–rSIFT–HQE+SP | mAP | 71.3 | | Unverified
10 | DELF–ASMK*+SP | mAP | 67.8 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AMES | mAP | 89.7 | | Unverified
2 | SuperGlobal | mAP | 86.7 | | Unverified
3 | Hypergraph propagation | mAP | 83.3 | | Unverified
4 | Token | mAP | 78.56 | | Unverified
5 | DELG+ α QE reranking + RRT reranking | mAP | 77.7 | | Unverified
6 | ResNet101+ArcFace GLDv2-train-clean | mAP | 70.3 | | Unverified
7 | FIRe | mAP | 70 | | Unverified
8 | DELF–HQE+SP | mAP | 69.3 | | Unverified
9 | HOW | mAP | 62.4 | | Unverified
10 | R–R-MAC | mAP | 59.4 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AMES | mAP | 94.9 | | Unverified
2 | Hypergraph propagation | mAP | 92.6 | | Unverified
3 | Token | mAP | 89.34 | | Unverified
4 | DELG+ α QE reranking + RRT reranking | mAP | 88.5 | | Unverified
5 | FIRe | mAP | 85.3 | | Unverified
6 | ResNet101+ArcFace GLDv2-train-clean | mAP | 84.9 | | Unverified
7 | DELF–HQE+SP | mAP | 84 | | Unverified
8 | HOW | mAP | 81.6 | | Unverified
9 | R–R-MAC | mAP | 78.9 | | Unverified
10 | R–GeM | mAP | 77.2 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Swin-T (MosaiCLIP, CC-12M) | Recall@1 (HN-Atom, UC) | 44.5 | | Unverified
2 | RN-50 (MosaiCLIP, CC-12M) | Recall@1 (HN-Atom, UC) | 44.4 | | Unverified
3 | MosaiCLIP (YFCC-FT) | Recall@1 (HN-Atom, UC) | 41.5 | | Unverified
4 | RN-50 (NegCLIP, CC-12M) | Recall@1 (HN-Atom, UC) | 41.4 | | Unverified
5 | MosaiCLIP (CC-FT) | Recall@1 (HN-Atom, UC) | 40.9 | | Unverified
6 | Swin-T (NegCLIP, CC-12M) | Recall@1 (HN-Atom, UC) | 39.6 | | Unverified
7 | CLIP (YFCC-FT) | Recall@1 (HN-Atom, UC) | 39.5 | | Unverified
8 | ViT-L-14 (LAION400M) | Recall@1 (HN-Atom + HN-Comp, SC) | 39.44 | | Unverified
9 | NegCLIP (YFCC-FT) | Recall@1 (HN-Atom, UC) | 39 | | Unverified
10 | CLIP-FT (YFCC-FT) | Recall@1 (HN-Atom, UC) | 38.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DQU-CIR | (Recall@10+Recall@50)/2 | 71.77 | | Unverified
2 | TMCIR | (Recall@10+Recall@50)/2 | 66.56 | | Unverified
3 | SPN4CIR (SPRC) | (Recall@10+Recall@50)/2 | 66.41 | | Unverified
4 | SPRC | (Recall@10+Recall@50)/2 | 64.85 | | Unverified
5 | Candidate Set Re-ranking | (Recall@10+Recall@50)/2 | 62.15 | | Unverified
6 | RUTIR (BLIP B/16) | (Recall@10+Recall@50)/2 | 61.32 | | Unverified
7 | CASE | (Recall@10+Recall@50)/2 | 59.73 | | Unverified
8 | CaLa | (Recall@10+Recall@50)/2 | 57.96 | | Unverified
9 | BLIP4CIR+Bi | (Recall@10+Recall@50)/2 | 55.4 | | Unverified
10 | CLIP4Cir (v3) | (Recall@10+Recall@50)/2 | 55.36 | | Unverified
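
# | Model | Metric | Claimed | Verified | Status
1 | X-VLM (base) | R@1 | 86.9 | | Unverified
2 | RCAR | R@1 | 62.6 | | Unverified
3 | SGRAF | R@1 | 58.5 | | Unverified
4 | LGSGM | R@1 | 57.4 | | Unverified
5 | VisualSparta | R@1 | 57.4 | | Unverified
6 | TERAN MrSw | R@1 | 56.5 | | Unverified
7 | TERAN Symm. | R@1 | 55.7 | | Unverified
8 | VSRN | R@1 | 54.7 | | Unverified
9 | CAMP | R@1 | 51.5 | | Unverified
10 | SCAN i-t | R@1 | 44 | | Unverified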

# | Model | Metric | Claimed | Verified | Status
1 | TMCIR | (Recall@5+Recall_subset@1)/2 | 83.46 | | Unverified
2 | SPN4CIR (SPRC) | (Recall@5+Recall_subset@1)/2 | 82.69 | | Unverified
3 | SPRC2 | (Recall@5+Recall_subset@1)/2 | 82.66 | | Unverified
4 | SPRC | (Recall@5+Recall_subset@1)/2 | 81.39 | | Unverified
5 | Candidate Set Re-ranking | (Recall@5+Recall_subset@1)/2 | 80.9 | | Unverified
6 | CaLa | (Recall@5+Recall_subset@1)/2 | 78.74 | | Unverified
7 | CASE (Pre-trained on LaSCo.Ca) | (Recall@5+Recall_subset@1)/2 | 78.25 | | Unverified
8 | CASE | (Recall@5+Recall_subset@1)/2 | 77.5 | | Unverified
9 | VISTA (base) | (Recall@5+Recall_subset@1)/2 | 75.9 | | Unverified
10 | MMRet-MLLM | (Recall@5+Recall_subset@1)/2 | 75.7 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Unicom+ViT-L@336px | R@1 | 91.2 | | Unverified
2 | ROADMAP (DeiT-B) | R@1 | 86 | | Unverified
3 | CGD (SG/GS) | R@1 | 84.2 | | Unverified
4 | ROADMAP (ResNet-50) | R@1 | 83.1 | | Unverified
5 | ProxyNCA++ | R@1 | 81.4 | | Unverified
6 | PNP Loss | R@1 | 81.1 | | Unverified
7 | Cross-Batch Memory | R@1 | 80.6 | | Unverified
8 | Smooth-AP | R@1 | 80.1 | | Unverified
9 | NormSoftmax2048 (ResNet-50) | R@1 | 79.5 | | Unverified
10 | EPSHN512 | R@1 | 78.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | InternVL-G-FT | R@1 | 85.9 | | Unverified
2 | InternVL-C-FT | R@1 | 85.2 | | Unverified
3 | CN-CLIP (ViT-L/14@336px) | R@1 | 84.4 | | Unverified
4 | R2D2 (ViT-L/14) | R@1 | 84.4 | | Unverified
5 | CN-CLIP (ViT-H/14) | R@1 | 83.8 | | Unverified
6 | CN-CLIP (ViT-L/14) | R@1 | 82.7 | | Unverified
7 | CN-CLIP (ViT-B/16) | R@1 | 79.1 | | Unverified
8 | R2D2 (ViT-B) | R@1 | 78.3 | | Unverified
9 | Wukong (ViT-L/14) | R@1 | 77.4 | | Unverified
10 | Wukong (ViT-B/32) | R@1 | 67.6 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Offline Diffusion | MAP | 96.2 | | Unverified
2 | CNN+IME layer | MAP | 92 | | Unverified
3 | DELF+FT+ATT+DIR+QE | MAP | 90 | | Unverified
4 | DIR+QE* | MAP | 89 | | Unverified
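
The tables above report mAP, Recall@K (R@1), and averaged recall such as (Recall@10+Recall@50)/2. As a minimal sketch of how these numbers are commonly computed from ranked results, the code below uses the textbook definitions. The toy rankings and function names are hypothetical, and individual benchmarks may apply their own conventions (for example, ignoring certain images during evaluation), so treat this as an illustration rather than any benchmark's official protocol.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """1.0 if at least one relevant item is in the top k results, else 0.0."""
    return 1.0 if any(item in relevant_ids for item in ranked_ids[:k]) else 0.0


def average_precision(ranked_ids, relevant_ids):
    """Mean of precision values at each rank where a relevant item appears."""
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0


def mean(values):
    """Arithmetic mean of a non-empty iterable."""
    values = list(values)
    return sum(values) / len(values)


# Hypothetical toy results: query id -> (ranked result ids, set of relevant ids).
results = {
    "q1": (["a", "b", "c", "d"], {"a", "c"}),
    "q2": (["x", "y", "z", "w"], {"z"}),
}

# mAP: average precision per query, then averaged over queries.
map_score = mean(average_precision(r, rel) for r, rel in results.values())

# Recall@K averaged over queries, plus an averaged-recall score in the
# style of (Recall@10+Recall@50)/2, here with K=1 and K=3 for the toy data.
r_at_1 = mean(recall_at_k(r, rel, 1) for r, rel in results.values())
r_at_3 = mean(recall_at_k(r, rel, 3) for r, rel in results.values())
avg_recall = (r_at_1 + r_at_3) / 2

print(round(map_score, 4), r_at_1, r_at_3, avg_recall)
```

Leaderboards usually report these values as percentages, so a score of 0.75 here would appear as 75.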