CLIP2Video: Mastering Video-Text Retrieval via Image CLIP Jun 21, 2021 Language Modeling Language Modelling
Code Code Available 1CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval Apr 18, 2021 Retrieval Text Retrieval
Code Code Available 1Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval Jul 1, 2020 Contrastive Learning Passage Retrieval
Code Code Available 1CLIP-Lite: Information Efficient Visual Representation Learning with Language Supervision Dec 14, 2021 Contrastive Learning Representation Learning
Code Code Available 1LDMol: Text-to-Molecule Diffusion Model with Structurally Informative Latent Space May 28, 2024 Contrastive Learning Decoder
Code Code Available 1Hyperbolic Image-Text Representations Apr 18, 2023 image-classification Image Classification
Code Code Available 1Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone Jun 15, 2022 Described Object Detection Image Captioning
Code Code Available 1Cocktail: A Comprehensive Information Retrieval Benchmark with LLM-Generated Documents Integration May 26, 2024 Information Retrieval Retrieval
Code Code Available 1A Comprehensive Review of the Video-to-Text Problem Mar 27, 2021 Question Answering Retrieval
Code Code Available 1COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning Oct 27, 2022 Language Modeling Language Modelling
Code Code Available 1A Survey of Medical Vision-and-Language Applications and Their Techniques Nov 19, 2024 Decision Making Diagnostic
Code Code Available 1HANet: Hierarchical Alignment Networks for Video-Text Retrieval Jul 26, 2021 Retrieval Text Matching
Code Code Available 1ComCLIP: Training-Free Compositional Image and Text Matching Nov 25, 2022 Image-text matching Image-text Retrieval
Code Code Available 1COM Kitchens: An Unedited Overhead-view Video Dataset as a Vision-Language Benchmark Aug 5, 2024 Dense Video Captioning Diversity
Code Code Available 1Bridging Language Gaps in Audio-Text Retrieval Jun 11, 2024 AudioCaps Retrieval
Code Code Available 1Composing Object Relations and Attributes for Image-Text Matching Jun 17, 2024 Attribute Graph Attention
Code Code Available 1LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval Mar 16, 2021 Image-text Retrieval Re-Ranking
Code Code Available 1Consensus-Aware Visual-Semantic Embedding for Image-Text Matching Jul 17, 2020 Image Captioning Image-text matching
Code Code Available 1Graph Optimal Transport for Cross-Domain Alignment Jun 26, 2020 Graph Matching Image Captioning
Code Code Available 1Helping Hands: An Object-Aware Ego-Centric Video Recognition Model Aug 15, 2023 Decoder Object
Code Code Available 1I0T: Embedding Standardization Method Towards Zero Modality Gap Dec 18, 2024 Contrastive Learning Image-text Retrieval
Code Code Available 1MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model Oct 11, 2022 Contrastive Learning Image-text matching
Code Code Available 1Bridging Video-text Retrieval with Multiple Choice Questions Jan 13, 2022 Action Recognition Linear evaluation
Code Code Available 1Audio Retrieval with Natural Language Queries: A Benchmark Study Dec 17, 2021 AudioCaps Audio captioning
Code Code Available 1Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Compositional Understanding Jun 15, 2023 Contrastive Learning image-classification
Code Code Available 1Contrastive Audio-Language Learning for Music Aug 25, 2022 Audio to Text Retrieval Descriptive
Code Code Available 1Global and Local Semantic Completion Learning for Vision-Language Pre-training Jun 12, 2023 cross-modal alignment Image-text Retrieval
Code Code Available 1GLoRIA: A Multimodal Global-Local Representation Learning Framework for Label-Efficient Medical Image Recognition Jan 1, 2021 Image-text Retrieval Medical Image Analysis
Code Code Available 1A Deep Local and Global Scene-Graph Matching for Image-Text Retrieval Jun 4, 2021 Graph Matching Image Retrieval
Code Code Available 1GLEN: Generative Retrieval via Lexical Index Learning Nov 6, 2023 Learning-To-Rank Retrieval
Code Code Available 1GOAL: Global-local Object Alignment Learning Mar 22, 2025 Descriptive Object
Code Code Available 1Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory Mar 19, 2024 Adversarial Text Diversity
Code Code Available 1FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions May 28, 2023 Attribute Image Captioning
Code Code Available 1GASLITEing the Retrieval: Exploring Vulnerabilities in Dense Embedding-based Search Dec 30, 2024 RAG Retrieval
Code Code Available 1Generative Multi-hop Retrieval Apr 27, 2022 Decoder GPU
Code Code Available 1Image-text Retrieval via Preserving Main Semantics of Vision Apr 20, 2023 Cross-Modal Retrieval Image-text Retrieval
Code Code Available 1Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval Aug 1, 2024 Attribute Optical Character Recognition
Code Code Available 1FlexiViT: One Model for All Patch Sizes Dec 15, 2022 All Image-text Retrieval
Code Code Available 1A Data-Centric Framework for Composable NLP Workflows Mar 2, 2021 Retrieval Text Retrieval
Code Code Available 1From Association to Generation: Text-only Captioning by Unsupervised Cross-modal Mapping Apr 26, 2023 Decoder Image Captioning
Code Code Available 1Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning Mar 1, 2020 Cross-Modal Retrieval Retrieval
Code Code Available 1CVLUE: A New Benchmark Dataset for Chinese Vision-Language Understanding Evaluation Jul 1, 2024 Image-text Retrieval Question Answering
Code Code Available 1FILIP: Fine-grained Interactive Language-Image Pre-Training Nov 9, 2021 image-classification Image Classification
Code Code Available 1Fine-Tuning LLaMA for Multi-Stage Text Retrieval Oct 12, 2023 Passage Retrieval Retrieval
Code Code Available 1DecAF: Joint Decoding of Answers and Logical Forms for Question Answering over Knowledge Bases Sep 30, 2022 Entity Linking Question Answering
Code Code Available 1Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training Jun 1, 2022 Contrastive Learning Cross-Lingual Transfer
Code Code Available 1FETA: Towards Specializing Foundation Models for Expert Task Applications Sep 8, 2022 Domain Generalization Few-Shot Learning
Code Code Available 1Cross-modal Scene Graph Matching for Relationship-aware Image-Text Retrieval Oct 11, 2019 Graph Matching Image-text Retrieval
Code Code Available 1Fine-Grained Image-Text Matching by Cross-Modal Hard Aligning Network Jan 1, 2023 Image-text matching Retrieval
Code Code Available 1Cross-Modal Retrieval with Partially Mismatched Pairs Feb 22, 2023 Contrastive Learning Cross-Modal Retrieval
Code Code Available 1