| Prompt-based Learning for Unpaired Image Captioning | May 26, 2022 | Image Captioning, Image-text Retrieval | Unverified | 0 |
| Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset | May 25, 2022 | Image Captioning, Image Retrieval | Unverified | 0 |
| HiVLP: Hierarchical Vision-Language Pre-Training for Fast Image-Text Retrieval | May 24, 2022 | Cross-Modal Retrieval, Image-text Retrieval | Unverified | 0 |
| Progressive Learning for Image Retrieval with Hybrid-Modality Queries | Apr 24, 2022 | Image Retrieval, Image-text Retrieval | Unverified | 0 |
| COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval | Apr 15, 2022 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0 |
| Robust Cross-Modal Representation Learning with Progressive Self-Distillation | Apr 10, 2022 | Contrastive Learning, Image Captioning | Unverified | 0 |
| Image-text Retrieval: A Survey on Recent Research and Development | Mar 28, 2022 | Image-text Retrieval, Retrieval | Unverified | 0 |
| Single-Stream Multi-Level Alignment for Vision-Language Pretraining | Mar 27, 2022 | Image-text Retrieval, Question Answering | Code Available | 0 |
| LoopITR: Combining Dual and Cross Encoder Architectures for Image-Text Retrieval | Mar 10, 2022 | Image-text Retrieval, Retrieval | Unverified | 0 |
| An Unsupervised Cross-Modal Hashing Method Robust to Noisy Training Image-Text Correspondences in Remote Sensing | Feb 26, 2022 | Image-text Retrieval, Meta-Learning | Code Available | 0 |