SOTAVerified

Image Retrieval

Image Retrieval is a fundamental and long-standing computer vision task: given a query, find the images most similar to it in a large database. It is often treated as a form of fine-grained, instance-level classification, and it sits alongside classification and cross-modal retrieval as a core image recognition task. Because it ranks images by visual similarity and other criteria, it lets users find relevant images efficiently, which makes it central to applications such as search and recommendation.
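
As a rough illustration of ranking by visual similarity, the sketch below scores every database image against a query using cosine similarity and returns the top matches. It assumes each image has already been reduced to a fixed-length descriptor vector. The file names, the three-dimensional vectors, and the `retrieve` helper are all made up for illustration and do not come from any method listed on this page.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero descriptor as matching nothing
    return dot / (norm_a * norm_b)


def retrieve(query, database, k=3):
    """Return the ids of the k database images most similar to the query."""
    scored = [
        (cosine_similarity(query, vec), image_id)
        for image_id, vec in database.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # best match first
    return [image_id for _, image_id in scored[:k]]


# Hypothetical descriptors; a real system would compute them with a trained model.
database = {
    "tower_day.jpg": [0.9, 0.1, 0.0],
    "tower_night.jpg": [0.8, 0.3, 0.1],
    "beach.jpg": [0.0, 0.2, 0.9],
    "forest.jpg": [0.1, 0.9, 0.2],
}
query = [1.0, 0.2, 0.0]

top2 = retrieve(query, database, k=2)
print(top2)
```

An exhaustive scan like this is only practical for small collections; large databases usually rely on an approximate nearest-neighbour index instead.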

Extending CLIP for Category-to-image Retrieval in E-commerce

(Image credit: DELF)

Papers

Showing 451–500 of 2239 papers

| Title | Status | Hype |
|---|---|---|
| Targeted Attack for Deep Hashing based Retrieval | Code | 1 |
| Classification is a Strong Baseline for Deep Metric Learning | Code | 0 |
| MABNet: Master Assistant Buddy Network with Hybrid Learning for Image Retrieval | Code | 0 |
| MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks | Code | 0 |
| Beyond Product Quantization: Deep Progressive Quantization for Image Retrieval | Code | 0 |
| LoOp: Looking for Optimal Hard Negative Embeddings for Deep Metric Learning | Code | 0 |
| LowCLIP: Adapting the CLIP Model Architecture for Low-Resource Languages in Multimodal Image Retrieval Task | Code | 0 |
| LogoNet: a fine-grained network for instance-level logo sketch retrieval | Code | 0 |
| Matchable Image Retrieval by Learning from Surface Reconstruction | Code | 0 |
| Local Features and Visual Words Emerge in Activations | Code | 0 |
| Local Features and Visual Words Emerge in Activations | Code | 0 |
| Leveraging Unlabeled Data for Crowd Counting by Learning to Rank | Code | 0 |
| Lifelong Histopathology Whole Slide Image Retrieval via Distance Consistency Rehearsal | Code | 0 |
| Patch-Wise Self-Supervised Visual Representation Learning: A Fine-Grained Approach | Code | 0 |
| Let's Transfer Transformations of Shared Semantic Representations | Code | 0 |
| Cross-Modality Sub-Image Retrieval using Contrastive Multimodal Image Representations | Code | 0 |
| Benchmarking Vision-Language Contrastive Methods for Medical Representation Learning | Code | 0 |
| Learning to Play Guess Who? and Inventing a Grounded Language as a Consequence | Code | 0 |
| Learning to Learn from Web Data through Deep Semantic Embeddings | Code | 0 |
| Learning to Minimize the Remainder in Supervised Learning | Code | 0 |
| Learning Self-Regularized Adversarial Views for Self-Supervised Vision Transformers | Code | 0 |
| Cross-Modal Attribute Insertions for Assessing the Robustness of Vision-and-Language Learning | Code | 0 |
| Domain-Aware SE Network for Sketch-based Image Retrieval with Multiplicative Euclidean Margin Softmax | Code | 0 |
| Learning Metrics from Teachers: Compact Networks for Image Embedding | Code | 0 |
| Learning Semantic Proxies from Visual Prompts for Parameter-Efficient Fine-Tuning in Deep Metric Learning | Code | 0 |
| Learning from Memory: Non-Parametric Memory Augmented Self-Supervised Learning of Visual Features | Code | 0 |
| Cross-Media Similarity Evaluation for Web Image Retrieval in the Wild | Code | 0 |
| Cross-Modal Coherence for Text-to-Image Retrieval | Code | 0 |
| CrossLocate: Cross-modal Large-scale Visual Geo-Localization in Natural Environments using Rendered Modalities | Code | 0 |
| Batch DropBlock Network for Person Re-identification and Beyond | Code | 0 |
| Learning Discriminative and Transformation Covariant Local Feature Detectors | Code | 0 |
| Adding Cues to Binary Feature Descriptors for Visual Place Recognition | Code | 0 |
| Cross-Domain Image Matching with Deep Feature Maps | Code | 0 |
| Learning Deep Representations of Fine-grained Visual Descriptions | Code | 0 |
| Learning with Average Precision: Training Image Retrieval with a Listwise Loss | Code | 0 |
| Learning discriminative and transformation covariant local feature detectors. | Code | 0 |
| Cross-dimensional Weighting for Aggregated Deep Convolutional Features | Code | 0 |
| Barcode Annotations for Medical Image Retrieval: A Preliminary Investigation | Code | 0 |
| CriSp: Leveraging Tread Depth Maps for Enhanced Crime-Scene Shoeprint Matching | Code | 0 |
| Learning compact binary descriptors with unsupervised deep neural networks | Code | 0 |
| Learning Deep Local Features With Multiple Dynamic Attentions for Large-Scale Image Retrieval | Code | 0 |
| Learning Disentangled Representations via Mutual Information Estimation | Code | 0 |
| Correspondence-Free Domain Alignment for Unsupervised Cross-Domain Image Retrieval | Code | 0 |
| A Zero-Shot Framework for Sketch-based Image Retrieval | Code | 0 |
| Correcting the Triplet Selection Bias for Triplet Loss | Code | 0 |
| Automating 3D Dataset Generation with Neural Radiance Fields | Code | 0 |
| Large Language Models and Multimodal Retrieval for Visual Word Sense Disambiguation | Code | 0 |
| COOKIE: Contrastive Cross-Modal Knowledge Sharing Pre-Training for Vision-Language Representation | Code | 0 |
| Looking at Outfit to Parse Clothing | Code | 0 |
| Large-Scale Historical Watermark Recognition: dataset and a new consistency-based approach | Code | 0 |

Benchmark Results

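| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SuperGlobal | mAP | 80.2 | | Unverified |
| 2 | AMES | mAP | 80 | | Unverified |
| 3 | Hypergraph propagation+community selection | mAP | 73 | | Unverified |
| 4 | Token | mAP | 66.57 | | Unverified |
| 5 | DELG + αQE reranking + RRT reranking | mAP | 64 | | Unverified |
| 6 | FIRe | mAP | 61.2 | | Unverified |
| 7 | HOW | mAP | 56.9 | | Unverified |
| 8 | ResNet101+ArcFace GLDv2-train-clean | mAP | 51.6 | | Unverified |
| 9 | DELF–HQE+SP | mAP | 50.3 | | Unverified |
| 10 | HesAff–rSIFT–HQE+SP | mAP | 49.7 | | Unverified |
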
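| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AMES | mAP | 90.7 | | Unverified |
| 2 | Hypergraph propagation+Community selection | mAP | 88.4 | | Unverified |
| 3 | Token | mAP | 82.28 | | Unverified |
| 4 | FIRe | mAP | 81.8 | | Unverified |
| 5 | DELG + αQE reranking + RRT reranking | mAP | 80.4 | | Unverified |
| 6 | HOW | mAP | 79.4 | | Unverified |
| 7 | ResNet101+ArcFace GLDv2-train-clean | mAP | 74.2 | | Unverified |
| 8 | DELF–HQE+SP | mAP | 73.4 | | Unverified |
| 9 | HesAff–rSIFT–HQE+SP | mAP | 71.3 | | Unverified |
| 10 | DELF–ASMK*+SP | mAP | 67.8 | | Unverified |
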
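| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AMES | mAP | 89.7 | | Unverified |
| 2 | SuperGlobal | mAP | 86.7 | | Unverified |
| 3 | Hypergraph propagation | mAP | 83.3 | | Unverified |
| 4 | Token | mAP | 78.56 | | Unverified |
| 5 | DELG + αQE reranking + RRT reranking | mAP | 77.7 | | Unverified |
| 6 | ResNet101+ArcFace GLDv2-train-clean | mAP | 70.3 | | Unverified |
| 7 | FIRe | mAP | 70 | | Unverified |
| 8 | DELF–HQE+SP | mAP | 69.3 | | Unverified |
| 9 | HOW | mAP | 62.4 | | Unverified |
| 10 | R–R-MAC | mAP | 59.4 | | Unverified |
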
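| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AMES | mAP | 94.9 | | Unverified |
| 2 | Hypergraph propagation | mAP | 92.6 | | Unverified |
| 3 | Token | mAP | 89.34 | | Unverified |
| 4 | DELG + αQE reranking + RRT reranking | mAP | 88.5 | | Unverified |
| 5 | FIRe | mAP | 85.3 | | Unverified |
| 6 | ResNet101+ArcFace GLDv2-train-clean | mAP | 84.9 | | Unverified |
| 7 | DELF–HQE+SP | mAP | 84 | | Unverified |
| 8 | HOW | mAP | 81.6 | | Unverified |
| 9 | R–R-MAC | mAP | 78.9 | | Unverified |
| 10 | R–GeM | mAP | 77.2 | | Unverified |
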
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Swin-T (MosaiCLIP, CC-12M) | Recall@1 (HN-Atom, UC) | 44.5 | | Unverified |
| 2 | RN-50 (MosaiCLIP, CC-12M) | Recall@1 (HN-Atom, UC) | 44.4 | | Unverified |
| 3 | MosaiCLIP (YFCC-FT) | Recall@1 (HN-Atom, UC) | 41.5 | | Unverified |
| 4 | RN-50 (NegCLIP, CC-12M) | Recall@1 (HN-Atom, UC) | 41.4 | | Unverified |
| 5 | MosaiCLIP (CC-FT) | Recall@1 (HN-Atom, UC) | 40.9 | | Unverified |
| 6 | Swin-T (NegCLIP, CC-12M) | Recall@1 (HN-Atom, UC) | 39.6 | | Unverified |
| 7 | CLIP (YFCC-FT) | Recall@1 (HN-Atom, UC) | 39.5 | | Unverified |
| 8 | ViT-L-14 (LAION400M) | Recall@1 (HN-Atom + HN-Comp, SC) | 39.44 | | Unverified |
| 9 | NegCLIP (YFCC-FT) | Recall@1 (HN-Atom, UC) | 39 | | Unverified |
| 10 | CLIP-FT (YFCC-FT) | Recall@1 (HN-Atom, UC) | 38.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DQU-CIR | (Recall@10+Recall@50)/2 | 71.77 | | Unverified |
| 2 | TMCIR | (Recall@10+Recall@50)/2 | 66.56 | | Unverified |
| 3 | SPN4CIR (SPRC) | (Recall@10+Recall@50)/2 | 66.41 | | Unverified |
| 4 | SPRC | (Recall@10+Recall@50)/2 | 64.85 | | Unverified |
| 5 | Candidate Set Re-ranking | (Recall@10+Recall@50)/2 | 62.15 | | Unverified |
| 6 | RUTIR (BLIP B/16) | (Recall@10+Recall@50)/2 | 61.32 | | Unverified |
| 7 | CASE | (Recall@10+Recall@50)/2 | 59.73 | | Unverified |
| 8 | CaLa | (Recall@10+Recall@50)/2 | 57.96 | | Unverified |
| 9 | BLIP4CIR+Bi | (Recall@10+Recall@50)/2 | 55.4 | | Unverified |
| 10 | CLIP4Cir (v3) | (Recall@10+Recall@50)/2 | 55.36 | | Unverified |

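| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | X-VLM (base) | R@1 | 86.9 | | Unverified |
| 2 | RCAR | R@1 | 62.6 | | Unverified |
| 3 | SGRAF | R@1 | 58.5 | | Unverified |
| 4 | VisualSparta | R@1 | 57.4 | | Unverified |
| 5 | LGSGM | R@1 | 57.4 | | Unverified |
| 6 | TERAN MrSw | R@1 | 56.5 | | Unverified |
| 7 | TERAN Symm. | R@1 | 55.7 | | Unverified |
| 8 | VSRN | R@1 | 54.7 | | Unverified |
| 9 | CAMP | R@1 | 51.5 | | Unverified |
| 10 | SCAN i-t | R@1 | 44 | | Unverified |
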
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TMCIR | (Recall@5+Recall_subset@1)/2 | 83.46 | | Unverified |
| 2 | SPN4CIR (SPRC) | (Recall@5+Recall_subset@1)/2 | 82.69 | | Unverified |
| 3 | SPRC2 | (Recall@5+Recall_subset@1)/2 | 82.66 | | Unverified |
| 4 | SPRC | (Recall@5+Recall_subset@1)/2 | 81.39 | | Unverified |
| 5 | Candidate Set Re-ranking | (Recall@5+Recall_subset@1)/2 | 80.9 | | Unverified |
| 6 | CaLa | (Recall@5+Recall_subset@1)/2 | 78.74 | | Unverified |
| 7 | CASE (Pre-trained on LaSCo.Ca) | (Recall@5+Recall_subset@1)/2 | 78.25 | | Unverified |
| 8 | CASE | (Recall@5+Recall_subset@1)/2 | 77.5 | | Unverified |
| 9 | VISTA (base) | (Recall@5+Recall_subset@1)/2 | 75.9 | | Unverified |
| 10 | MMRet-MLLM | (Recall@5+Recall_subset@1)/2 | 75.7 | | Unverified |

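| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Unicom+ViT-L@336px | R@1 | 91.2 | | Unverified |
| 2 | ROADMAP (DeiT-B) | R@1 | 86 | | Unverified |
| 3 | CGD (SG/GS) | R@1 | 84.2 | | Unverified |
| 4 | ROADMAP (ResNet-50) | R@1 | 83.1 | | Unverified |
| 5 | ProxyNCA++ | R@1 | 81.4 | | Unverified |
| 6 | PNP Loss | R@1 | 81.1 | | Unverified |
| 7 | Cross-Batch Memory | R@1 | 80.6 | | Unverified |
| 8 | Smooth-AP | R@1 | 80.1 | | Unverified |
| 9 | NormSoftmax2048 (ResNet-50) | R@1 | 79.5 | | Unverified |
| 10 | EPSHN512 | R@1 | 78.3 | | Unverified |
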
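| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | InternVL-G-FT | R@1 | 85.9 | | Unverified |
| 2 | InternVL-C-FT | R@1 | 85.2 | | Unverified |
| 3 | R2D2 (ViT-L/14) | R@1 | 84.4 | | Unverified |
| 4 | CN-CLIP (ViT-L/14@336px) | R@1 | 84.4 | | Unverified |
| 5 | CN-CLIP (ViT-H/14) | R@1 | 83.8 | | Unverified |
| 6 | CN-CLIP (ViT-L/14) | R@1 | 82.7 | | Unverified |
| 7 | CN-CLIP (ViT-B/16) | R@1 | 79.1 | | Unverified |
| 8 | R2D2 (ViT-B) | R@1 | 78.3 | | Unverified |
| 9 | Wukong (ViT-L/14) | R@1 | 77.4 | | Unverified |
| 10 | Wukong (ViT-B/32) | R@1 | 67.6 | | Unverified |
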
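| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Offline Diffusion | MAP | 96.2 | | Unverified |
| 2 | CNN+IME layer | MAP | 92 | | Unverified |
| 3 | DELF+FT+ATT+DIR+QE | MAP | 90 | | Unverified |
| 4 | DIR+QE* | MAP | 89 | | Unverified |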
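
The leaderboards above report mAP, Recall@K (R@K), and averaged recall such as (Recall@10+Recall@50)/2. The sketch below shows one common way these quantities are computed from ranked result lists. It is a generic illustration on made-up toy data. Individual benchmarks may follow their own protocols (for example, ignoring certain images or counting recall differently), and the helper names here are hypothetical.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """1.0 if at least one relevant item appears in the top k, else 0.0."""
    return 1.0 if any(item in relevant_ids for item in ranked_ids[:k]) else 0.0


def average_precision(ranked_ids, relevant_ids):
    """Average of the precision values at each rank holding a relevant item."""
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant_ids:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0


def mean(values):
    return sum(values) / len(values)


# Toy queries (assumed data): ranked results plus the set of relevant ids.
queries = [
    (["a", "b", "c", "d"], {"a", "c"}),
    (["x", "y", "z", "w"], {"z"}),
]

r_at_1 = mean([recall_at_k(ranked, rel, 1) for ranked, rel in queries])
r_at_3 = mean([recall_at_k(ranked, rel, 3) for ranked, rel in queries])
mean_ap = mean([average_precision(ranked, rel) for ranked, rel in queries])

# Averaged recall, analogous in form to (Recall@10+Recall@50)/2.
avg_recall = (r_at_1 + r_at_3) / 2

print(r_at_1, r_at_3, avg_recall, round(mean_ap, 4))
```

Leaderboards typically report these values as percentages, so a score of 0.75 here would appear as 75.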