| Direction-Oriented Visual-semantic Embedding Model for Remote Sensing Image-text Retrieval | Oct 12, 2023 | Cross-Modal Retrieval, Image-text Retrieval | Unverified | 0 |
| Ziya-Visual: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning | Oct 12, 2023 | Image Captioning, Image-text Retrieval | Unverified | 0 |
| Constructing Image-Text Pair Dataset from Books | Oct 3, 2023 | Image-text Retrieval, Optical Character Recognition (OCR) | Unverified | 0 |
| Dual Relation Alignment for Composed Image Retrieval | Sep 5, 2023 | Image Retrieval, Image-text Retrieval | Unverified | 0 |
| MultiWay-Adapater: Adapting large-scale multi-modal models for scalable image-text retrieval | Sep 4, 2023 | Image-text Retrieval, Retrieval | Code Available | 0 |
| Contrastive Feature Masking Open-Vocabulary Vision Transformer | Sep 2, 2023 | Contrastive Learning, Image-text Retrieval | Unverified | 0 |
| DLIP: Distilling Language-Image Pre-training | Aug 24, 2023 | Image Captioning, Image-text Retrieval | Unverified | 0 |
| EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE | Aug 23, 2023 | Image-text matching, Image-text Retrieval | Unverified | 0 |
| Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks | Aug 13, 2023 | Contrastive Learning, image-classification | Unverified | 0 |
| Distilling Knowledge from Text-to-Image Generative Models Improves Visio-Linguistic Reasoning in CLIP | Jul 18, 2023 | Attribute, Image-text Retrieval | Unverified | 0 |