| Breaking Language Barriers or Reinforcing Bias? A Study of Gender and Racial Disparities in Multilingual Contrastive Vision Language Models | May 20, 2025 | Image-text Retrieval, Text Retrieval | —Unverified | 0 |
| Deep Semantic Multimodal Hashing Network for Scalable Image-Text and Video-Text Retrievals | Jan 9, 2019 | Cross-Modal Retrieval, Deep Hashing | —Unverified | 0 |
| AnyAttack: Towards Large-scale Self-supervised Adversarial Attacks on Vision-language Models | Oct 7, 2024 | Image Captioning, Image-text Retrieval | —Unverified | 0 |
| Toward Automatic Relevance Judgment using Vision-Language Models for Image-Text Retrieval Evaluation | Aug 2, 2024 | Image-text Retrieval, Retrieval | —Unverified | 0 |
| DCFormer: Efficient 3D Vision-Language Modeling with Decomposed Convolutions | Feb 7, 2025 | Anomaly Detection, Image-text Retrieval | —Unverified | 0 |
| CtrlSynth: Controllable Image Text Synthesis for Data-Efficient Multimodal Learning | Oct 15, 2024 | Image-text Retrieval, Text Retrieval | —Unverified | 0 |
| Beat: Bi-directional One-to-Many Embedding Alignment for Text-based Person Retrieval | Jun 9, 2024 | Image-text Retrieval, Person Retrieval | —Unverified | 0 |
| Knowledge Transfer Across Modalities with Natural Language Supervision | Nov 23, 2024 | Image-text Retrieval, Novel Concepts | —Unverified | 0 |
| How to Make Cross Encoder a Good Teacher for Efficient Image-Text Retrieval? | Jul 10, 2024 | Contrastive Learning, Image-text Retrieval | —Unverified | 0 |
| HiVLP: Hierarchical Vision-Language Pre-Training for Fast Image-Text Retrieval | May 24, 2022 | Cross-Modal Retrieval, Image-text Retrieval | —Unverified | 0 |