| LightCLIP: Learning Multi-Level Interaction for Lightweight Vision-Language Models | Dec 1, 2023 | Image Classification | —Unverified | 0 |
| How to Make Cross Encoder a Good Teacher for Efficient Image-Text Retrieval? | Jul 10, 2024 | Contrastive Learning, Image-text Retrieval | —Unverified | 0 |
| HiVLP: Hierarchical Vision-Language Pre-Training for Fast Image-Text Retrieval | May 24, 2022 | Cross-Modal Retrieval, Image-text Retrieval | —Unverified | 0 |
| Image-text Retrieval: A Survey on Recent Research and Development | Mar 28, 2022 | Image-text Retrieval, Retrieval | —Unverified | 0 |
| HGAN: Hierarchical Graph Alignment Network for Image-Text Retrieval | Dec 16, 2022 | Image-text Retrieval, Retrieval | —Unverified | 0 |
| A New Fine-grained Alignment Method for Image-text Matching | Nov 3, 2023 | Image-text matching, Image-text Retrieval | —Unverified | 0 |
| Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset | May 25, 2022 | Image Captioning, Image Retrieval | —Unverified | 0 |
| Improving Adversarial Transferability of Vision-Language Pre-training Models through Collaborative Multimodal Interaction | Mar 16, 2024 | Adversarial Robustness, Image-text Retrieval | —Unverified | 0 |
| Barking Up The Syntactic Tree: Enhancing VLM Training with Syntactic Losses | Dec 11, 2024 | Image-text Retrieval, Question Answering | —Unverified | 0 |
| Learning to embed semantic similarity for joint image-text retrieval | Oct 7, 2022 | Image-text Retrieval, Metric Learning | —Unverified | 0 |