CLIP-KD: An Empirical Study of CLIP Model Distillation (Jul 24, 2023). Tasks: Contrastive Learning, Cross-Modal Retrieval. Code available.
FashionBERT: Text and Image Matching with Adaptive Loss for Cross-modal Retrieval (May 20, 2020). Tasks: Cross-Modal Retrieval, Retrieval. Code available.
Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models (May 31, 2023). Tasks: Cross-Modal Retrieval, Question Answering. Code available.
Transformer Decoders with MultiModal Regularization for Cross-Modal Food Retrieval (Apr 20, 2022). Tasks: Cross-Modal Retrieval, Retrieval. Code available.
A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval (Jan 8, 2022). Tasks: Cross-Modal Retrieval, Information Retrieval. Code available.
Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning (Mar 1, 2020). Tasks: Cross-Modal Retrieval, Retrieval. Code available.
Learning Cross-Modal Retrieval With Noisy Labels (Jun 19, 2021). Tasks: Cross-Modal Retrieval, Retrieval. Code available.
Single-branch Network for Multimodal Training (Mar 10, 2023). Tasks: Cross-Modal Retrieval, Retrieval. Code available.
StacMR: Scene-Text Aware Cross-Modal Retrieval (Dec 8, 2020). Tasks: Cross-Modal Retrieval, Information Retrieval. Code available.
Visual Semantic Reasoning for Image-Text Matching (Sep 6, 2019). Tasks: Cross-Modal Retrieval, Image Retrieval. Code available.
RoCOCO: Robustness Benchmark of MS-COCO to Stress-test Image-Text Matching Models (Apr 21, 2023). Tasks: Cross-Modal Retrieval, Image-text matching. Code available.
A Channel Mix Method for Fine-Grained Cross-Modal Retrieval (Aug 26, 2022). Tasks: Cross-Modal Retrieval, Retrieval. Code available.
CHEF: Cross-modal Hierarchical Embeddings for Food Domain Retrieval (Feb 4, 2021). Tasks: Cross-Modal Retrieval, Retrieval. Code available.
Cross-Modal Retrieval in the Cooking Context: Learning Semantic Text-Image Embeddings (Apr 30, 2018). Tasks: BIG-bench Machine Learning, Cross-Modal Retrieval. Code available.
CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval (Sep 12, 2019). Tasks: Cross-Modal Retrieval, Image Retrieval. Code available.
Alternative Telescopic Displacement: An Efficient Multimodal Alignment Method (Jun 29, 2023). Tasks: Arrhythmia Detection, Cross-Modal Retrieval. Code available.
Scene-centric vs. Object-centric Image-Text Cross-modal Retrieval: A Reproducibility Study (Jan 12, 2023). Tasks: Cross-Modal Retrieval, Object. Code available.
PromptHash: Affinity-Prompted Collaborative Cross-Modal Learning for Adaptive Hashing Retrieval (Mar 20, 2025). Tasks: Contrastive Learning, Cross-Modal Retrieval. Code available.
Bridging Vision and Language Spaces with Assignment Prediction (Apr 15, 2024). Tasks: Cross-Modal Retrieval, Image Captioning. Code available.
Probabilistic Embeddings for Frozen Vision-Language Models: Uncertainty Quantification with Gaussian Process Latent Variable Models (May 8, 2025). Tasks: Active Learning, cross-modal alignment. Code available.
Balance Act: Mitigating Hubness in Cross-Modal Retrieval with Query and Gallery Banks (Oct 17, 2023). Tasks: Cross-Modal Retrieval, Retrieval. Code available.
Picture It In Your Mind: Generating High Level Visual Representations From Textual Descriptions (Jun 23, 2016). Tasks: Cross-Modal Information Retrieval, Cross-Modal Retrieval. Code available.
Exposing and Mitigating Spurious Correlations for Cross-Modal Retrieval (Apr 6, 2023). Tasks: Cross-Modal Retrieval, Image-text Retrieval. Code available.
OPT: Omni-Perception Pre-Trainer for Cross-Modal Understanding and Generation (Jul 1, 2021). Tasks: Audio to Text Retrieval, Cross-Modal Retrieval. Code available.
Exploring modality-agnostic representations for music classification (Jun 2, 2021). Tasks: Classification, Cross-Modal Retrieval. Code available.
NeighborRetr: Balancing Hub Centrality in Cross-Modal Retrieval (Mar 13, 2025). Tasks: Cross-Modal Retrieval, Retrieval. Code available.
Evaluation of Audio-Visual Alignments in Visually Grounded Speech Models (Jul 5, 2021). Tasks: Cross-Modal Retrieval, Object Localization. Code available.
ERNIE-ViL 2.0: Multi-view Contrastive Learning for Image-Text Pre-training (Sep 30, 2022). Tasks: Computational Efficiency, Contrastive Learning. Code available.
See, Hear, and Read: Deep Aligned Representations (Jun 3, 2017). Tasks: Cross-Modal Retrieval, Representation Learning. Code available.
Multimodal LLM Enhanced Cross-lingual Cross-modal Retrieval (Sep 30, 2024). Tasks: Cross-Modal Retrieval, Large Language Model. Code available.
Multilingual Vision-Language Pre-training for the Remote Sensing Domain (Oct 30, 2024). Tasks: Cross-Modal Retrieval, image-classification. Code available.
ELIP: Efficient Language-Image Pre-training with Fewer Vision Tokens (Sep 28, 2023). Tasks: Cross-Modal Retrieval, GPU. Code available.
Aligning Multilingual Word Embeddings for Cross-Modal Retrieval Task (Oct 8, 2019). Tasks: Cross-Modal Retrieval, Image to text. Code available.
MuLan: A Joint Embedding of Music Audio and Natural Language (Aug 26, 2022). Tasks: Cross-Modal Retrieval, Music Tagging. Code available.
Modality-specific Cross-modal Similarity Measurement with Recurrent Attention Network (Aug 16, 2017). Tasks: Cross-Modal Retrieval, Retrieval. Code available.
Efficient Cross-Modal Retrieval via Deep Binary Hashing and Quantization (Feb 15, 2022). Tasks: Cross-Modal Retrieval, Deep Hashing. Code available.
ModalChorus: Visual Probing and Alignment of Multi-modal Embeddings via Modal Fusion Map (Jul 17, 2024). Tasks: Cross-Modal Retrieval, Dimensionality Reduction. Code available.
Effective and Efficient Indexing in Cross-Modal Hashing-Based Datasets (Apr 30, 2019). Tasks: Cross-Modal Retrieval, Retrieval. Code available.
Dynamic Self-adaptive Multiscale Distillation from Pre-trained Multimodal Large Model for Efficient Cross-modal Representation Learning (Apr 16, 2024). Tasks: Cross-Modal Retrieval, Representation Learning. Code available.
Align before Search: Aligning Ads Image to Text for Accurate Cross-Modal Sponsored Search (Sep 28, 2023). Tasks: cross-modal alignment, Cross-Modal Retrieval. Code available.
Dynamic Adapter with Semantics Disentangling for Cross-lingual Cross-modal Retrieval (Dec 18, 2024). Tasks: Cross-Modal Retrieval, Retrieval. Code available.
MTFH: A Matrix Tri-Factorization Hashing Framework for Efficient Cross-Modal Retrieval (May 4, 2018). Tasks: Cross-Modal Retrieval, Retrieval. Code available.
MXM-CLR: A Unified Framework for Contrastive Learning of Multifold Cross-Modal Representations (Mar 20, 2023). Tasks: Contrastive Learning, Cross-Modal Retrieval. Code available.
COOKIE: Contrastive Cross-Modal Knowledge Sharing Pre-Training for Vision-Language Representation (Jan 1, 2021). Tasks: Contrastive Learning, Cross-Modal Retrieval. Code available.
Dual-Path Convolutional Image-Text Embeddings with Instance Loss (Nov 15, 2017). Tasks: Content-Based Image Retrieval, Cross-Modal Retrieval. Code available.
Contrastive Transformer Learning with Proximity Data Generation for Text-Based Person Search (Nov 15, 2023). Tasks: Contrastive Learning, Cross-Modal Retrieval. Code available.
DocMMIR: A Framework for Document Multi-modal Information Retrieval (May 25, 2025). Tasks: Articles, Cross-Modal Retrieval. Code available.
Leveraging Acoustic Images for Effective Self-Supervised Audio Representation Learning (Aug 1, 2020). Tasks: Cross-Modal Retrieval, Representation Learning. Code available.
Learning Visual Actions Using Multiple Verb-Only Labels (Jul 25, 2019). Tasks: Action Recognition, Cross-Modal Retrieval. Code available.
Learning Text-Image Joint Embedding for Efficient Cross-Modal Retrieval with Deep Feature Engineering (Oct 22, 2021). Tasks: Cross-Modal Retrieval, Feature Engineering. Code available.