| DCFormer: Efficient 3D Vision-Language Modeling with Decomposed Convolutions | Feb 7, 2025 | Anomaly Detection, Image-text Retrieval | —Unverified | 0 | 0 |
| Deep Semantic Multimodal Hashing Network for Scalable Image-Text and Video-Text Retrievals | Jan 9, 2019 | Cross-Modal Retrieval, Deep Hashing | —Unverified | 0 | 0 |
| Direction-Oriented Visual-semantic Embedding Model for Remote Sensing Image-text Retrieval | Oct 12, 2023 | Cross-Modal Retrieval, Image-text Retrieval | —Unverified | 0 | 0 |
| VladVA: Discriminative Fine-tuning of LVLMs | Dec 5, 2024 | Image-text Retrieval, Representation Learning | —Unverified | 0 | 0 |
| Distill CLIP (DCLIP): Enhancing Image-Text Retrieval via Cross-Modal Transformer Distillation | May 25, 2025 | Contrastive Learning, Image-text Retrieval | —Unverified | 0 | 0 |
| DLIP: Distilling Language-Image Pre-training | Aug 24, 2023 | Image Captioning, Image-text Retrieval | —Unverified | 0 | 0 |
| Dual Relation Alignment for Composed Image Retrieval | Sep 5, 2023 | Image Retrieval, Image-text Retrieval | —Unverified | 0 | 0 |
| Dynamic Contrastive Distillation for Image-Text Retrieval | Jul 4, 2022 | Contrastive Learning, GPU | —Unverified | 0 | 0 |
| Efficient Image Captioning for Edge Devices | Dec 18, 2022 | CPU, Image Captioning | —Unverified | 0 | 0 |
| Efficient Image-Text Retrieval via Keyword-Guided Pre-Screening | Mar 14, 2023 | Image-text Retrieval, Multi-Label Classification | —Unverified | 0 | 0 |