Image Retrieval

Image retrieval is a fundamental, long-standing computer vision task: given a query, find the similar images in a large database. It is often treated as a form of fine-grained, instance-level classification and, alongside classification and cross-modal retrieval, is a core part of image recognition. Because it ranks images by visual similarity and other criteria, image retrieval lets users discover relevant images efficiently, which makes it central to applications such as search and recommendation.
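As a rough illustration of "finding images similar to a given query", the sketch below ranks a tiny database by cosine similarity between descriptor vectors. It assumes each image has already been mapped to a fixed-length descriptor by some feature extractor; the image ids, the 3-dimensional vectors, and the function names are hypothetical and chosen only for the example. Real systems typically use high-dimensional learned descriptors and an approximate nearest-neighbour index instead of a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve(query, database, k=3):
    """Return the ids of the k database images most similar to the query."""
    scored = [(cosine_similarity(query, vec), image_id)
              for image_id, vec in database.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # best match first
    return [image_id for _, image_id in scored[:k]]


# Toy descriptors (hypothetical values, not produced by any real model).
database = {
    "tower_day": [0.9, 0.1, 0.0],
    "tower_night": [0.8, 0.3, 0.1],
    "beach": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]
top = retrieve(query, database, k=2)
print(top)
```

The exhaustive scan is linear in the database size, which is fine for a demonstration but is the part a production system would replace first.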

Extending CLIP for Category-to-image Retrieval in E-commerce

(Image credit: DELF)

Papers


Showing 201–250 of 2239 papers

Title	Date	Tasks	Status	Hype
Garment Attribute Manipulation with Multi-level Attention	Sep 16, 2024	Attribute, Image Retrieval	Unverified	0
Evaluating Pre-trained Convolutional Neural Networks and Foundation Models as Feature Extractors for Content-based Medical Image Retrieval	Sep 14, 2024	Contrastive Learning, Image Retrieval	Code Available	4
A Cross-Font Image Retrieval Network for Recognizing Undeciphered Oracle Bone Inscriptions	Sep 10, 2024	Image Retrieval, Retrieval	Unverified	0
Referring Expression Generation in Visually Grounded Dialogue with Discourse-aware Comprehension Guiding	Sep 9, 2024	Image Retrieval, Referring Expression	Code Available	0
Open-World Dynamic Prompt and Continual Visual Representation Learning	Sep 9, 2024	Continual Learning, Image Retrieval	Unverified	0
Training-free Zero-shot Composed Image Retrieval via Weighted Modality Fusion and Similarity	Sep 7, 2024	Image Captioning, Image Retrieval	Code Available	0
Zero-Shot Whole Slide Image Retrieval in Histopathology Using Embeddings of Foundation Models	Sep 6, 2024	Diagnostic, Image Retrieval	Unverified	0
Design and Evaluation of Camera-Centric Mobile Crowdsourcing Applications	Sep 4, 2024	Image Retrieval, Retrieval	Unverified	0
NUDGE: Lightweight Non-Parametric Fine-Tuning of Embeddings for Retrieval	Sep 4, 2024	Image Retrieval, RAG	Code Available	1
Optimizing CLIP Models for Image Retrieval with Maintained Joint-Embedding Alignment	Sep 3, 2024	Image Retrieval, Retrieval	Code Available	0
Evidential Transformers for Improved Image Retrieval	Sep 2, 2024	Content-Based Image Retrieval, Image Retrieval	Unverified	0
A Review of Image Retrieval Techniques: Data Augmentation and Adversarial Learning Approaches	Sep 2, 2024	Data Augmentation, Image Retrieval	Unverified	0
Rethinking Sparse Lexical Representations for Image Retrieval in the Age of Rising Multi-Modal Large Language Models	Aug 29, 2024	Data Augmentation, Image Retrieval	Unverified	0
Temporal Attention for Cross-View Sequential Image Localization	Aug 28, 2024	Image Retrieval, Retrieval	Code Available	0
Snap and Diagnose: An Advanced Multimodal Retrieval System for Identifying Plant Diseases in the Wild	Aug 27, 2024	Cross-Modal Retrieval, Image Retrieval	Unverified	0
LowCLIP: Adapting the CLIP Model Architecture for Low-Resource Languages in Multimodal Image Retrieval Task	Aug 25, 2024	Computational Efficiency, Image Augmentation	Code Available	0
Visual Localization in 3D Maps: Comparing Point Cloud, Mesh, and NeRF Representations	Aug 21, 2024	GPU, Image Retrieval	Unverified	0
UniFashion: A Unified Vision-Language Model for Multimodal Fashion Retrieval and Generation	Aug 21, 2024	Image Generation, Image Retrieval	Code Available	1
Fashion Image-to-Image Translation for Complementary Item Retrieval	Aug 19, 2024	Image Retrieval, Image-to-Image Translation	Unverified	0
BrewCLIP: A Bifurcated Representation Learning Framework for Audio-Visual Retrieval	Aug 19, 2024	Image Retrieval, Representation Learning	Unverified	0
Cross-Modal Denoising: A Novel Training Paradigm for Enhancing Speech-Image Retrieval	Aug 15, 2024	cross-modal alignment, Denoising	Unverified	0
Coarse-to-fine Alignment Makes Better Speech-image Retrieval	Aug 15, 2024	cross-modal alignment, Image Retrieval	Unverified	0
DM2RM: Dual-Mode Multimodal Ranking for Target Objects and Receptacles Based on Open-Vocabulary Instructions	Aug 15, 2024	Image Retrieval, Language Modelling	Unverified	0
Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network	Aug 10, 2024	geo-localization, Image Retrieval	Code Available	2
AMES: Asymmetric and Memory-Efficient Similarity Estimation for Instance-level Retrieval	Aug 6, 2024	Image Retrieval, Re-Ranking	Code Available	1
On Validation of Search & Retrieval of Tissue Images in Digital Pathology	Aug 2, 2024	Content-Based Image Retrieval, Diagnostic	Unverified	0
Revolutionizing Text-to-Image Retrieval as Autoregressive Token-to-Voken Generation	Jul 24, 2024	Avg, Cross-Modal Retrieval	Unverified	0
Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark	Jul 18, 2024	GPU, Image Retrieval	Code Available	1
EndoFinder: Online Image Retrieval for Explainable Colorectal Polyp Diagnosis	Jul 16, 2024	Content-Based Image Retrieval, Contrastive Learning	Unverified	0
Addressing Image Hallucination in Text-to-Image Generation through Factual Image Retrieval	Jul 15, 2024	Common Sense Reasoning, Hallucination	Unverified	0
No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations	Jul 15, 2024	All, Image Retrieval	Code Available	1
An experimental evaluation of Siamese Neural Networks for robot localization using omnidirectional imaging in indoor environments	Jul 15, 2024	Image Retrieval	Unverified	0
Are They the Same Picture? Adapting Concept Bottleneck Models for Human-AI Collaboration in Image Retrieval	Jul 12, 2024	Image Retrieval, Retrieval	Code Available	0
LDRE: LLM-based Divergent Reasoning and Ensemble for Zero-Shot Composed Image Retrieval	Jul 11, 2024	Image Retrieval, Image to text	Code Available	2
Lifelong Histopathology Whole Slide Image Retrieval via Distance Consistency Rehearsal	Jul 11, 2024	Image Retrieval, Retrieval	Code Available	0
Multi-Group Proportional Representation in Retrieval	Jul 11, 2024	Image Retrieval, Retrieval	Code Available	0
CEIA: CLIP-Based Event-Image Alignment for Open-World Event-Based Understanding	Jul 9, 2024	Contrastive Learning, Domain Adaptation	Unverified	0
HyCIR: Boosting Zero-Shot Composed Image Retrieval with Synthetic Labels	Jul 8, 2024	Contrastive Learning, Image Retrieval	Unverified	0
Pseudo-triplet Guided Few-shot Composed Image Retrieval	Jul 8, 2024	Active Learning, Image Retrieval	Unverified	0
Elevating All Zero-Shot Sketch-Based Image Retrieval Through Multimodal Prompt Learning	Jul 5, 2024	All, Image Retrieval	Code Available	0
Visualizing Dialogues: Enhancing Image Selection through Dialogue Understanding with Large Language Models	Jul 4, 2024	Dialogue Understanding, Image Retrieval	Code Available	0
Learning from Memory: Non-Parametric Memory Augmented Self-Supervised Learning of Visual Features	Jul 3, 2024	Image Retrieval, Retrieval	Code Available	0
Celeb-FBI: A Benchmark Dataset on Human Full Body Images and Age, Gender, Height and Weight Estimation using Deep Learning Approach	Jul 3, 2024	Image Retrieval	Unverified	0
Cross-Modal Attention Alignment Network with Auxiliary Text Description for zero-shot sketch-based image retrieval	Jul 1, 2024	cross-modal alignment, Image Retrieval	Unverified	0
Freeview Sketching: View-Aware Fine-Grained Sketch-Based Image Retrieval	Jul 1, 2024	Disentanglement, Image Retrieval	Unverified	0
PathAlign: A vision-language model for whole slide images in histopathology	Jun 27, 2024	Diagnostic, Image Retrieval	Unverified	0
Zero-shot Composed Image Retrieval Considering Query-target Relationship Leveraging Masked Image-text Pairs	Jun 27, 2024	Image Retrieval, Language Modeling	Unverified	0
WV-Net: A foundation model for SAR WV-mode satellite imagery trained using contrastive self-supervised learning on 10 million images	Jun 26, 2024	Image Retrieval, Self-Supervised Learning	Code Available	0
Breaking the Frame: Visual Place Recognition by Overlap Prediction	Jun 23, 2024	Image Retrieval, Pose Estimation	Code Available	1
CLIP-Branches: Interactive Fine-Tuning for Text-Image Retrieval	Jun 19, 2024	Image Retrieval, Information Retrieval	Code Available	0


Datasets: ROxford (Hard), ROxford (Medium), RParis (Hard), RParis (Medium), CREPE (Compositional REPresentation Evaluation), Fashion IQ, Flickr30K 1K test, CIRR, SOP, Flickr30k-CN, Oxf5k, Flickr30k

Benchmark Results

Dataset: ROxford (Hard)
#	Model	Metric	Claimed	Verified	Status
1	SuperGlobal	mAP	80.2	—	Unverified
2	AMES	mAP	80	—	Unverified
3	Hypergraph propagation+community selection	mAP	73	—	Unverified
4	Token	mAP	66.57	—	Unverified
5	DELG+ α QE reranking+ RRT reranking	mAP	64	—	Unverified
6	FIRe	mAP	61.2	—	Unverified
7	HOW	mAP	56.9	—	Unverified
8	ResNet101+ArcFace GLDv2-train-clean	mAP	51.6	—	Unverified
9	DELF–HQE+SP	mAP	50.3	—	Unverified
10	HesAff–rSIFT–HQE+SP	mAP	49.7	—	Unverified

Dataset: ROxford (Medium)
#	Model	Metric	Claimed	Verified	Status
1	AMES	mAP	90.7	—	Unverified
2	Hypergraph propagation+Community selection	mAP	88.4	—	Unverified
3	Token	mAP	82.28	—	Unverified
4	FIRe	mAP	81.8	—	Unverified
5	DELG+ α QE reranking + RRT reranking	mAP	80.4	—	Unverified
6	HOW	mAP	79.4	—	Unverified
7	ResNet101+ArcFace GLDv2-train-clean	mAP	74.2	—	Unverified
8	DELF–HQE+SP	mAP	73.4	—	Unverified
9	HesAff–rSIFT–HQE+SP	mAP	71.3	—	Unverified
10	DELF–ASMK*+SP	mAP	67.8	—	Unverified

Dataset: RParis (Hard)
#	Model	Metric	Claimed	Verified	Status
1	AMES	mAP	89.7	—	Unverified
2	SuperGlobal	mAP	86.7	—	Unverified
3	Hypergraph propagation	mAP	83.3	—	Unverified
4	Token	mAP	78.56	—	Unverified
5	DELG+ α QE reranking + RRT reranking	mAP	77.7	—	Unverified
6	ResNet101+ArcFace GLDv2-train-clean	mAP	70.3	—	Unverified
7	FIRe	mAP	70	—	Unverified
8	DELF–HQE+SP	mAP	69.3	—	Unverified
9	HOW	mAP	62.4	—	Unverified
10	R–R-MAC	mAP	59.4	—	Unverified

Dataset: RParis (Medium)
#	Model	Metric	Claimed	Verified	Status
1	AMES	mAP	94.9	—	Unverified
2	Hypergraph propagation	mAP	92.6	—	Unverified
3	Token	mAP	89.34	—	Unverified
4	DELG+ α QE reranking + RRT reranking	mAP	88.5	—	Unverified
5	FIRe	mAP	85.3	—	Unverified
6	ResNet101+ArcFace GLDv2-train-clean	mAP	84.9	—	Unverified
7	DELF–HQE+SP	mAP	84	—	Unverified
8	HOW	mAP	81.6	—	Unverified
9	R–R-MAC	mAP	78.9	—	Unverified
10	R–GeM	mAP	77.2	—	Unverified
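The tables above report mAP (mean average precision). As a reminder of what that number measures, here is a minimal sketch of the standard definition: average precision is computed per query over the ranked result list, then averaged over queries. The query ids and rankings are made up for illustration, and the sketch is a simplification: benchmark protocols may also exclude certain images from scoring, which is not modelled here.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked list against a set of relevant ids."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    return precision_sum / len(relevant)


def mean_average_precision(rankings, ground_truth):
    """Mean of the per-query average precision values."""
    aps = [average_precision(rankings[q], ground_truth[q]) for q in rankings]
    return sum(aps) / len(aps)


# Hypothetical rankings for two queries.
rankings = {
    "q1": ["a", "b", "c", "d"],
    "q2": ["x", "y", "z"],
}
ground_truth = {
    "q1": {"a", "c"},  # AP = (1/1 + 2/3) / 2
    "q2": {"y"},       # AP = 1/2
}
map_score = mean_average_precision(rankings, ground_truth)
print(round(100 * map_score, 2))  # leaderboards usually report percentages
```

Relevant images that never appear in the ranked list still count in the denominator, so a truncated result list lowers the score.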

Dataset: CREPE (Compositional REPresentation Evaluation)
#	Model	Metric	Claimed	Verified	Status
1	Swin-T (MosaiCLIP, CC-12M)	Recall@1 (HN-Atom, UC)	44.5	—	Unverified
2	RN-50 (MosaiCLIP, CC-12M)	Recall@1 (HN-Atom, UC)	44.4	—	Unverified
3	MosaiCLIP (YFCC-FT)	Recall@1 (HN-Atom, UC)	41.5	—	Unverified
4	RN-50 (NegCLIP, CC-12M)	Recall@1 (HN-Atom, UC)	41.4	—	Unverified
5	MosaiCLIP (CC-FT)	Recall@1 (HN-Atom, UC)	40.9	—	Unverified
6	Swin-T (NegCLIP, CC-12M)	Recall@1 (HN-Atom, UC)	39.6	—	Unverified
7	CLIP (YFCC-FT)	Recall@1 (HN-Atom, UC)	39.5	—	Unverified
8	ViT-L-14 (LAION400M)	Recall@1 (HN-Atom + HN-Comp, SC)	39.44	—	Unverified
9	NegCLIP (YFCC-FT)	Recall@1 (HN-Atom, UC)	39	—	Unverified
10	CLIP-FT (YFCC-FT)	Recall@1 (HN-Atom, UC)	38.3	—	Unverified

Dataset: Fashion IQ
#	Model	Metric	Claimed	Verified	Status
1	DQU-CIR	(Recall@10+Recall@50)/2	71.77	—	Unverified
2	TMCIR	(Recall@10+Recall@50)/2	66.56	—	Unverified
3	SPN4CIR (SPRC)	(Recall@10+Recall@50)/2	66.41	—	Unverified
4	SPRC	(Recall@10+Recall@50)/2	64.85	—	Unverified
5	Candidate Set Re-ranking	(Recall@10+Recall@50)/2	62.15	—	Unverified
6	RUTIR (BLIP B/16)	(Recall@10+Recall@50)/2	61.32	—	Unverified
7	CASE	(Recall@10+Recall@50)/2	59.73	—	Unverified
8	CaLa	(Recall@10+Recall@50)/2	57.96	—	Unverified
9	BLIP4CIR+Bi	(Recall@10+Recall@50)/2	55.4	—	Unverified
10	CLIP4Cir (v3)	(Recall@10+Recall@50)/2	55.36	—	Unverified
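The metric in the table above, (Recall@10+Recall@50)/2, averages two Recall@K values. A minimal sketch of that arithmetic follows, assuming each query has exactly one target image and that Recall@K is the percentage of queries whose target appears in the top K results. The function names and toy data are hypothetical, and the toy example uses K = 1 and K = 2 only because its ranked lists are short.

```python
def recall_at_k(rankings, targets, k):
    """Percentage of queries whose target is among the top-k results."""
    hits = sum(1 for q, ranked in rankings.items() if targets[q] in ranked[:k])
    return 100.0 * hits / len(rankings)


def averaged_recall(rankings, targets, ks=(10, 50)):
    """Mean of Recall@K over the given cut-offs, e.g. (R@10 + R@50) / 2."""
    return sum(recall_at_k(rankings, targets, k) for k in ks) / len(ks)


# Hypothetical ranked lists for four queries, one target image each.
rankings = {
    "q1": ["a", "b", "c"],
    "q2": ["d", "e", "f"],
    "q3": ["g", "h", "i"],
    "q4": ["j", "k", "l"],
}
targets = {"q1": "b", "q2": "d", "q3": "i", "q4": "z"}

r1 = recall_at_k(rankings, targets, 1)   # only q2 hits at rank 1
r2 = recall_at_k(rankings, targets, 2)   # q1 and q2 hit within the top 2
score = averaged_recall(rankings, targets, ks=(1, 2))
print(r1, r2, score)
```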

Dataset: Flickr30K 1K test
#	Model	Metric	Claimed	Verified	Status
1	X-VLM (base)	R@1	86.9	—	Unverified
2	RCAR	R@1	62.6	—	Unverified
3	SGRAF	R@1	58.5	—	Unverified
4	LGSGM	R@1	57.4	—	Unverified
5	VisualSparta	R@1	57.4	—	Unverified
6	TERAN MrSw	R@1	56.5	—	Unverified
7	TERAN Symm.	R@1	55.7	—	Unverified
8	VSRN	R@1	54.7	—	Unverified
9	CAMP	R@1	51.5	—	Unverified
10	SCAN i-t	R@1	44	—	Unverified

Dataset: CIRR
#	Model	Metric	Claimed	Verified	Status
1	TMCIR	(Recall@5+Recall_subset@1)/2	83.46	—	Unverified
2	SPN4CIR (SPRC)	(Recall@5+Recall_subset@1)/2	82.69	—	Unverified
3	SPRC2	(Recall@5+Recall_subset@1)/2	82.66	—	Unverified
4	SPRC	(Recall@5+Recall_subset@1)/2	81.39	—	Unverified
5	Candidate Set Re-ranking	(Recall@5+Recall_subset@1)/2	80.9	—	Unverified
6	CaLa	(Recall@5+Recall_subset@1)/2	78.74	—	Unverified
7	CASE (Pre-trained on LaSCo.Ca)	(Recall@5+Recall_subset@1)/2	78.25	—	Unverified
8	CASE	(Recall@5+Recall_subset@1)/2	77.5	—	Unverified
9	VISTA (base)	(Recall@5+Recall_subset@1)/2	75.9	—	Unverified
10	MMRet-MLLM	(Recall@5+Recall_subset@1)/2	75.7	—	Unverified

Dataset: SOP
#	Model	Metric	Claimed	Verified	Status
1	Unicom+ViT-L@336px	R@1	91.2	—	Unverified
2	ROADMAP (DeiT-B)	R@1	86	—	Unverified
3	CGD (SG/GS)	R@1	84.2	—	Unverified
4	ROADMAP (ResNet-50)	R@1	83.1	—	Unverified
5	ProxyNCA++	R@1	81.4	—	Unverified
6	PNP Loss	R@1	81.1	—	Unverified
7	Cross-Batch Memory	R@1	80.6	—	Unverified
8	Smooth-AP	R@1	80.1	—	Unverified
9	NormSoftmax2048 (ResNet-50)	R@1	79.5	—	Unverified
10	EPSHN512	R@1	78.3	—	Unverified

Dataset: Flickr30k-CN
#	Model	Metric	Claimed	Verified	Status
1	InternVL-G-FT	R@1	85.9	—	Unverified
2	InternVL-C-FT	R@1	85.2	—	Unverified
3	CN-CLIP (ViT-L/14@336px)	R@1	84.4	—	Unverified
4	R2D2 (ViT-L/14)	R@1	84.4	—	Unverified
5	CN-CLIP (ViT-H/14)	R@1	83.8	—	Unverified
6	CN-CLIP (ViT-L/14)	R@1	82.7	—	Unverified
7	CN-CLIP (ViT-B/16)	R@1	79.1	—	Unverified
8	R2D2 (ViT-B)	R@1	78.3	—	Unverified
9	Wukong (ViT-L/14)	R@1	77.4	—	Unverified
10	Wukong (ViT-B/32)	R@1	67.6	—	Unverified

Dataset: Oxf5k
#	Model	Metric	Claimed	Verified	Status
1	Offline Diffusion	MAP	96.2	—	Unverified
2	CNN+IME layer	MAP	92	—	Unverified
3	DELF+FT+ATT+DIR+QE	MAP	90	—	Unverified
4	DIR+QE*	MAP	89	—	Unverified