CLIP2Video: Mastering Video-Text Retrieval via Image CLIP Jun 21, 2021 Language Modeling Language Modelling
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval Apr 18, 2021 Retrieval Text Retrieval
FETA: Towards Specializing Foundation Models for Expert Task Applications Sep 8, 2022 Domain Generalization Few-Shot Learning
A Comprehensive Review of the Video-to-Text Problem Mar 27, 2021 Question Answering Retrieval
Extending Multi-modal Contrastive Representations Oct 13, 2023 3D Object Classification Representation Learning
Eye-gaze Guided Multi-modal Alignment for Medical Representation Learning Mar 19, 2024 Diagnostic image-classification
Bridging Language Gaps in Audio-Text Retrieval Jun 11, 2024 AudioCaps Retrieval
Cocktail: A Comprehensive Information Retrieval Benchmark with LLM-Generated Documents Integration May 26, 2024 Information Retrieval Retrieval
Exploring Classic and Neural Lexical Translation Models for Information Retrieval: Interpretability, Effectiveness, and Efficiency Benefits Feb 12, 2021 CPU Document Ranking
COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning Oct 27, 2022 Language Modeling Language Modelling
LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Retrieval Feb 6, 2023 Image-text Retrieval Retrieval
GASLITEing the Retrieval: Exploring Vulnerabilities in Dense Embedding-based Search Dec 30, 2024 RAG Retrieval
ComCLIP: Training-Free Compositional Image and Text Matching Nov 25, 2022 Image-text matching Image-text Retrieval
COM Kitchens: An Unedited Overhead-view Video Dataset as a Vision-Language Benchmark Aug 5, 2024 Dense Video Captioning Diversity
Bridging Video-text Retrieval with Multiple Choice Questions Jan 13, 2022 Action Recognition Linear evaluation
Composing Object Relations and Attributes for Image-Text Matching Jun 17, 2024 Attribute Graph Attention
Learning Semantic Relationship Among Instances for Image-Text Matching Jan 1, 2023 Cross-Modal Retrieval Image Retrieval
Consensus-Aware Visual-Semantic Embedding for Image-Text Matching Jul 17, 2020 Image Captioning Image-text matching
Learning the Best Pooling Strategy for Visual Semantic Embedding Nov 9, 2020 Cross-Modal Information Retrieval Image-text Retrieval
ESA: External Space Attention Aggregation for Image-Text Retrieval Oct 10, 2023 Image-text Retrieval Retrieval
A Deep Local and Global Scene-Graph Matching for Image-Text Retrieval Jun 4, 2021 Graph Matching Image Retrieval
Graph Optimal Transport for Cross-Domain Alignment Jun 26, 2020 Graph Matching Image Captioning
Learning Relation Alignment for Calibrated Cross-modal Retrieval May 28, 2021 Cross-Modal Retrieval Image-text Retrieval
Audio Retrieval with Natural Language Queries: A Benchmark Study Dec 17, 2021 AudioCaps Audio captioning
Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Compositional Understanding Jun 15, 2023 Contrastive Learning image-classification
Contrastive Audio-Language Learning for Music Aug 25, 2022 Audio to Text Retrieval Descriptive
Learning to Rank in Generative Retrieval Jun 27, 2023 Learning-To-Rank Passage Ranking
Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory Mar 19, 2024 Adversarial Text Diversity
Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner May 19, 2023 Dense Captioning Image Captioning
Learnable Pillar-based Re-ranking for Image-Text Retrieval Apr 25, 2023 Image-text Retrieval Re-Ranking
Equivariant Similarity for Vision-Language Foundation Models Mar 25, 2023 Image-text Retrieval Retrieval
Learning a Text-Video Embedding from Incomplete and Heterogeneous Data Apr 7, 2018 Retrieval Text Retrieval
Less is More: Pretrain a Strong Siamese Encoder for Dense Text Retrieval Using a Weak Decoder Nov 1, 2021 Decoder Language Modeling
LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Sparse Retrieval Jan 1, 2023 image-classification Image Classification
Language-agnostic BERT Sentence Embedding Jul 3, 2020 Language Modeling Language Modelling
Knowledge Guided Text Retrieval and Reading for Open Domain Question Answering Nov 10, 2019 Natural Questions Open-Domain Question Answering
A Data-Centric Framework for Composable NLP Workflows Mar 2, 2021 Retrieval Text Retrieval
LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrieval Mar 11, 2022 Contrastive Learning Re-Ranking
Efficient Token-Guided Image-Text Retrieval with Consistent Multimodal Contrastive Training Jun 15, 2023 Image-text Retrieval Representation Learning
Efficient Medical Vision-Language Alignment Through Adapting Masked Vision Models Jun 10, 2025 Contrastive Learning Image-text matching
Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment Aug 29, 2022 cross-modal alignment Image-text Retrieval
Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training Jun 1, 2022 Contrastive Learning Cross-Lingual Transfer
Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling Apr 14, 2021 GPU Re-Ranking
Cross-modal Scene Graph Matching for Relationship-aware Image-Text Retrieval Oct 11, 2019 Graph Matching Image-text Retrieval
Kaleido-BERT: Vision-Language Pre-training on Fashion Domain Mar 30, 2021 Image Retrieval Retrieval
Large-Scale Adversarial Training for Vision-and-Language Representation Learning Jun 11, 2020 Image-text Retrieval Question Answering
CVLUE: A New Benchmark Dataset for Chinese Vision-Language Understanding Evaluation Jul 1, 2024 Image-text Retrieval Question Answering
Cross-Modal Retrieval with Partially Mismatched Pairs Feb 22, 2023 Contrastive Learning Cross-Modal Retrieval
Cross-Modal Retrieval for Motion and Text via DopTriple Loss May 7, 2023 Cross-Modal Retrieval Retrieval
DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval Jun 10, 2025 Image Captioning Retrieval