Knowledge Guided Text Retrieval and Reading for Open Domain Question Answering | Nov 10, 2019 | Natural Questions Open-Domain Question Answering | Code Available 1
Babel-ImageNet: Massively Multilingual Evaluation of Vision-and-Language Representations | Jun 14, 2023 | image-classification Image Classification | Code Available 1
Language-agnostic BERT Sentence Embedding | Jul 3, 2020 | Language Modeling Language Modelling | Code Available 1
Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift | Dec 15, 2022 | Benchmarking Image Captioning | Code Available 1
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval | Apr 18, 2021 | Retrieval Text Retrieval | Code Available 1
Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner | May 19, 2023 | Dense Captioning Image Captioning | Code Available 1
Learning a Text-Video Embedding from Incomplete and Heterogeneous Data | Apr 7, 2018 | Retrieval Text Retrieval | Code Available 1
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers | May 27, 2023 | Image Captioning Image Retrieval | Code Available 1
ESA: External Space Attention Aggregation for Image-Text Retrieval | Oct 10, 2023 | Image-text Retrieval Retrieval | Code Available 1
CLIP-Lite: Information Efficient Visual Representation Learning with Language Supervision | Dec 14, 2021 | Contrastive Learning Representation Learning | Code Available 1
Stacked Cross Attention for Image-Text Matching | Mar 21, 2018 | Cross-Modal Retrieval Image Retrieval | Code Available 1
SViTT: Temporal Learning of Sparse Video-Text Transformers | Apr 18, 2023 | Question Answering Retrieval | Code Available 1
Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss | Sep 9, 2021 | Mixture-of-Experts Retrieval | Code Available 1
IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval | Mar 8, 2020 | Cross-Modal Retrieval Image-text Retrieval | Code Available 1
ALIP: Adaptive Language-Image Pre-training with Synthetic Caption | Aug 16, 2023 | Action Classification Image-text Retrieval | Code Available 1
CoSMo: Content-Style Modulation for Image Retrieval With Text Feedback | Jun 19, 2021 | Image Retrieval Image-text Retrieval | Code Available 1
Image-text Retrieval via Preserving Main Semantics of Vision | Apr 20, 2023 | Cross-Modal Retrieval Image-text Retrieval | Code Available 1
Towards Fast and Accurate Image-Text Retrieval with Self-Supervised Fine-Grained Alignment | Aug 27, 2023 | Contrastive Learning Image-text Retrieval | Code Available 1
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | Dec 21, 2023 | Image Retrieval Image-to-Text Retrieval | Code Available 1
Understanding Differential Search Index for Text Retrieval | May 3, 2023 | Information Retrieval Retrieval | Code Available 1
Exploring Classic and Neural Lexical Translation Models for Information Retrieval: Interpretability, Effectiveness, and Efficiency Benefits | Feb 12, 2021 | CPU Document Ranking | Code Available 1
COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning | Oct 27, 2022 | Language Modeling Language Modelling | Code Available 1
GLoRIA: A Multimodal Global-Local Representation Learning Framework for Label-Efficient Medical Image Recognition | Jan 1, 2021 | Image-text Retrieval Medical Image Analysis | Code Available 1
Learning Video Context as Interleaved Multimodal Sequences | Jul 31, 2024 | Language Modeling Language Modelling | Code Available 1
MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Module Plugin | Oct 21, 2023 | Language Modelling Retrieval | Code Available 1
Eye-gaze Guided Multi-modal Alignment for Medical Representation Learning | Mar 19, 2024 | Diagnostic image-classification | Code Available 1
Hyperbolic Image-Text Representations | Apr 18, 2023 | image-classification Image Classification | Code Available 1
ComCLIP: Training-Free Compositional Image and Text Matching | Nov 25, 2022 | Image-text matching Image-text Retrieval | Code Available 1
FETA: Towards Specializing Foundation Models for Expert Task Applications | Sep 8, 2022 | Domain Generalization Few-Shot Learning | Code Available 1
FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions | May 28, 2023 | Attribute Image Captioning | Code Available 1
I0T: Embedding Standardization Method Towards Zero Modality Gap | Dec 18, 2024 | Contrastive Learning Image-text Retrieval | Code Available 1
Condenser: a Pre-training Architecture for Dense Retrieval | Apr 16, 2021 | Language Modelling Retrieval | Code Available 1
Fine-Grained Image-Text Matching by Cross-Modal Hard Aligning Network | Jan 1, 2023 | Image-text matching Retrieval | Code Available 1
Composing Object Relations and Attributes for Image-Text Matching | Jun 17, 2024 | Attribute Graph Attention | Code Available 1
Learning Relation Alignment for Calibrated Cross-modal Retrieval | May 28, 2021 | Cross-Modal Retrieval Image-text Retrieval | Code Available 1
Fine-Tuning LLaMA for Multi-Stage Text Retrieval | Oct 12, 2023 | Passage Retrieval Retrieval | Code Available 1
Consensus-Aware Visual-Semantic Embedding for Image-Text Matching | Jul 17, 2020 | Image Captioning Image-text matching | Code Available 1
Mitigating the Impact of False Negatives in Dense Retrieval with Contrastive Confidence Regularization | Dec 30, 2023 | Answer Generation Contrastive Learning | Code Available 1
Continual learning in cross-modal retrieval | Apr 14, 2021 | Continual Learning cross-modal alignment | Unverified 0
Free-Form Multi-Modal Multimedia Retrieval (4MR) | Mar 29, 2023 | Form Management | Unverified 0
Context-Aware Attention Network for Image-Text Retrieval | Jun 1, 2020 | Image-text Retrieval Retrieval | Unverified 0
Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks | Aug 13, 2023 | Contrastive Learning image-classification | Unverified 0
Constructing Phrase-level Semantic Labels to Form Multi-Grained Supervision for Image-Text Retrieval | Nov 16, 2021 | Form Image-text Retrieval | Unverified 0
Attentive Deep Neural Networks for Legal Document Retrieval | Dec 13, 2022 | Articles Question Answering | Unverified 0
FocalLens: Instruction Tuning Enables Zero-Shot Conditional Image Representations | Apr 11, 2025 | image-classification Image Classification | Unverified 0
Constructing Phrase-level Semantic Labels to Form Multi-Grained Supervision for Image-Text Retrieval | Sep 12, 2021 | Form Image-text Retrieval | Unverified 0
FLAP: Fast Language-Audio Pre-training | Nov 2, 2023 | AudioCaps Contrastive Learning | Unverified 0
Constructing Image-Text Pair Dataset from Books | Oct 3, 2023 | Image-text Retrieval Optical Character Recognition (OCR) | Unverified 0
Align, Adapt and Inject: Sound-guided Unified Image Generation | Jun 20, 2023 | Image Generation Retrieval | Unverified 0
How to Make Cross Encoder a Good Teacher for Efficient Image-Text Retrieval? | Jul 10, 2024 | Contrastive Learning Image-text Retrieval | Unverified 0