| Where Does the Performance Improvement Come From? -- A Reproducibility Concern about Image-Text Retrieval | Mar 8, 2022 | Image-text Retrieval, Information Retrieval | Code Available | 1 |
| FILIP: Fine-grained Interactive Language-Image Pre-Training | Nov 9, 2021 | image-classification, Image Classification | Code Available | 1 |
| VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts | Nov 3, 2021 | Image Retrieval, Image-text Retrieval | Code Available | 1 |
| Align before Fuse: Vision and Language Representation Learning with Momentum Distillation | Jul 16, 2021 | Cross-Modal Retrieval, Grounded language learning | Code Available | 1 |
| Dynamic Modality Interaction Modeling for Image-Text Retrieval | Jul 11, 2021 | cross-modal alignment, Cross-Modal Retrieval | Code Available | 1 |
| CoSMo: Content-Style Modulation for Image Retrieval With Text Feedback | Jun 19, 2021 | Image Retrieval, Image-text Retrieval | Code Available | 1 |
| A Deep Local and Global Scene-Graph Matching for Image-Text Retrieval | Jun 4, 2021 | Graph Matching, Image Retrieval | Code Available | 1 |
| Learning Relation Alignment for Calibrated Cross-modal Retrieval | May 28, 2021 | Cross-Modal Retrieval, Image-text Retrieval | Code Available | 1 |
| LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval | Mar 16, 2021 | Image-text Retrieval, Re-Ranking | Code Available | 1 |
| GLoRIA: A Multimodal Global-Local Representation Learning Framework for Label-Efficient Medical Image Recognition | Jan 1, 2021 | Image-text Retrieval, Medical Image Analysis | Code Available | 1 |