| Direction-Oriented Visual-semantic Embedding Model for Remote Sensing Image-text Retrieval | Oct 12, 2023 | Cross-Modal Retrieval, Image-text Retrieval | —Unverified | 0 |
| VladVA: Discriminative Fine-tuning of LVLMs | Dec 5, 2024 | Image-text Retrieval, Representation Learning | —Unverified | 0 |
| Distill CLIP (DCLIP): Enhancing Image-Text Retrieval via Cross-Modal Transformer Distillation | May 25, 2025 | Contrastive Learning, Image-text Retrieval | —Unverified | 0 |
| DLIP: Distilling Language-Image Pre-training | Aug 24, 2023 | Image Captioning, Image-text Retrieval | —Unverified | 0 |
| Dual Relation Alignment for Composed Image Retrieval | Sep 5, 2023 | Image Retrieval, Image-text Retrieval | —Unverified | 0 |
| Dynamic Contrastive Distillation for Image-Text Retrieval | Jul 4, 2022 | Contrastive Learning, GPU | —Unverified | 0 |
| Efficient Image Captioning for Edge Devices | Dec 18, 2022 | CPU, Image Captioning | —Unverified | 0 |
| Efficient Image-Text Retrieval via Keyword-Guided Pre-Screening | Mar 14, 2023 | Image-text Retrieval, Multi-Label Classification | —Unverified | 0 |
| Efficient Multilingual Multi-modal Pre-training through Triple Contrastive Loss | Oct 1, 2022 | Image Classification | —Unverified | 0 |
| Enhancing Conceptual Understanding in Multimodal Contrastive Learning through Hard Negative Samples | Mar 5, 2024 | Concept Alignment, Contrastive Learning | —Unverified | 0 |