| CCMB: A Large-scale Chinese Cross-modal Benchmark | May 8, 2022 | Image Classification | Code Available | 1 |
| Progressive Learning for Image Retrieval with Hybrid-Modality Queries | Apr 24, 2022 | Image Retrieval, Image-text Retrieval | Unverified | 0 |
| COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval | Apr 15, 2022 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0 |
| Robust Cross-Modal Representation Learning with Progressive Self-Distillation | Apr 10, 2022 | Contrastive Learning, Image Captioning | Unverified | 0 |
| Image-text Retrieval: A Survey on Recent Research and Development | Mar 28, 2022 | Image-text Retrieval, Retrieval | Unverified | 0 |
| Single-Stream Multi-Level Alignment for Vision-Language Pretraining | Mar 27, 2022 | Image-text Retrieval, Question Answering | Code Available | 0 |
| LoopITR: Combining Dual and Cross Encoder Architectures for Image-Text Retrieval | Mar 10, 2022 | Image-text Retrieval, Retrieval | Unverified | 0 |
| Where Does the Performance Improvement Come From? -- A Reproducibility Concern about Image-Text Retrieval | Mar 8, 2022 | Image-text Retrieval, Information Retrieval | Code Available | 1 |
| An Unsupervised Cross-Modal Hashing Method Robust to Noisy Training Image-Text Correspondences in Remote Sensing | Feb 26, 2022 | Image-text Retrieval, Meta-Learning | Code Available | 0 |
| Vision-Language Pre-Training with Triple Contrastive Learning | Feb 21, 2022 | Contrastive Learning, cross-modal alignment | Code Available | 2 |