mCLIP: Multilingual CLIP via Cross-lingual Transfer Jul 10, 2023 Contrastive Learning Cross-Lingual Transfer
BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping Oct 29, 2023 Contrastive Learning Cross-Modal Retrieval
More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval Mar 25, 2021 All Cross-Modal Retrieval
CaLa: Complementary Association Learning for Augmenting Composed Image Retrieval May 29, 2024 Cross-Modal Retrieval Image Retrieval
A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval Dec 6, 2022 Cross-Modal Retrieval Image-text matching
Florence: A New Foundation Model for Computer Vision Nov 22, 2021 Action Classification Action Recognition
M3-Jepa: Multimodal Alignment via Multi-directional MoE based on the JEPA framework Sep 9, 2024 Computational Efficiency Cross-Modal Retrieval
Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective Dec 8, 2023 Cross-Modal Retrieval Data Augmentation
A Molecular Multimodal Foundation Model Associating Molecule Graphs with Natural Language Sep 12, 2022 Contrastive Learning Cross-Modal Retrieval
FedCMR: Federated Cross-Modal Retrieval Jul 1, 2021 Cross-Modal Retrieval Federated Learning
LRSCLIP: A Vision-Language Foundation Model for Aligning Remote Sensing Image with Longer Text Mar 25, 2025 Cross-Modal Retrieval Hallucination
Order-Embeddings of Images and Language Nov 19, 2015 Cross-Modal Retrieval Image Captioning
COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning Nov 1, 2020 Cross-Modal Retrieval Representation Learning
MAKE: Multi-Aspect Knowledge-Enhanced Vision-Language Pretraining for Zero-shot Dermatological Assessment May 14, 2025 Clinical Knowledge Contrastive Learning
Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts Nov 16, 2021 Cross-Modal Retrieval Image Captioning
Learning to Rematch Mismatched Pairs for Robust Cross-Modal Retrieval Mar 8, 2024 Cross-Modal Retrieval Cross-modal retrieval with noisy correspondence
A Survey on Interpretable Cross-modal Reasoning Sep 5, 2023 Cross-Modal Retrieval Decision Making
Learning Tri-modal Embeddings for Zero-Shot Soundscape Mapping Sep 19, 2023 Cross-Modal Retrieval
Learning Relation Alignment for Calibrated Cross-modal Retrieval May 28, 2021 Cross-Modal Retrieval Image-text Retrieval
Learning Cross-Modal Retrieval With Noisy Labels Jun 19, 2021 Cross-Modal Retrieval Retrieval
Learning Semantic Relationship Among Instances for Image-Text Matching Jan 1, 2023 Cross-Modal Retrieval Image Retrieval
Learning with Noisy Correspondence for Cross-modal Matching Dec 1, 2021 Cross-Modal Retrieval Cross-modal retrieval with noisy correspondence
Multi-Label Cross-Modal Retrieval Dec 1, 2015 Cross-Modal Retrieval Retrieval
Deep Evidential Learning with Noisy Correspondence for Cross-Modal Retrieval Oct 10, 2022 Cross-Modal Retrieval Cross-modal retrieval with noisy correspondence
IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval Mar 8, 2020 Cross-Modal Retrieval Image-text Retrieval
IMPACT: A Large-scale Integrated Multimodal Patent Analysis and Creation Dataset for Design Patents Dec 10, 2024 Cross-Modal Retrieval Image Classification
Improving Cross-Modal Retrieval with Set of Diverse Embeddings Nov 30, 2022 Cross-Modal Retrieval Retrieval
Integrating multi-label contrastive learning with dual adversarial graph neural networks for cross-modal retrieval Jul 5, 2022 Contrastive Learning Cross-Modal Retrieval
Disentangling and Generating Modalities for Recommendation in Missing Modality Scenarios Apr 23, 2025 Cross-Modal Retrieval Recommendation Systems
IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages Jan 27, 2022 Cross-Modal Retrieval Few-Shot Learning
Learning Dual Semantic Relations with Graph Attention for Image-Text Matching Oct 22, 2020 Cross-Modal Retrieval Graph Attention
Learning Modal-Invariant and Temporal-Memory for Video-based Visible-Infrared Person Re-Identification Aug 4, 2022 Cross-Modal Retrieval Person Re-Identification
Graph Structured Network for Image-Text Matching Apr 1, 2020 Attribute Cross-Modal Retrieval
Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models May 31, 2023 Cross-Modal Retrieval Question Answering
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation Jul 16, 2021 Cross-Modal Retrieval Grounded language learning
Learning to Evaluate Performance of Multi-modal Semantic Localization Sep 14, 2022 Cross-Modal Retrieval Referring Expression
CodeCMR: Cross-Modal Retrieval For Function-Level Binary Source Code Matching Dec 1, 2020 Computer Security Cross-Modal Retrieval
Domain-Smoothing Network for Zero-Shot Sketch-Based Image Retrieval Jun 22, 2021 Cross-Modal Retrieval Diversity
A Prior Instruction Representation Framework for Remote Sensing Image-text Retrieval Oct 27, 2023 Cross-Modal Retrieval Image-text Retrieval
Distinctive Image Captioning: Leveraging Ground Truth Captions in CLIP Guided Reinforcement Learning Feb 21, 2024 Cross-Modal Retrieval Image Captioning
Cross-modal transformers for infrared and visible image fusion Jun 26, 2023 Cross-Modal Retrieval Depth Estimation
Dual adversarial graph neural networks for multi-label cross-modal retrieval May 18, 2021 Cross-Modal Retrieval Retrieval
Image-text Retrieval via Preserving Main Semantics of Vision Apr 20, 2023 Cross-Modal Retrieval Image-text Retrieval
Dynamic Modality Interaction Modeling for Image-Text Retrieval Jul 11, 2021 cross-modal alignment Cross-Modal Retrieval
Knowledge-enhanced Visual-Language Pretraining for Computational Pathology Apr 15, 2024 Cross-Modal Retrieval Language Modeling
Emotion Embedding Spaces for Matching Music to Stories Nov 26, 2021 Cross-Modal Retrieval Metric Learning
BadCM: Invisible Backdoor Attack Against Cross-Modal Learning Oct 3, 2024 Backdoor Attack Cross-Modal Retrieval
End-to-end Knowledge Retrieval with Multi-modal Queries Jun 1, 2023 Benchmarking Cross-Modal Retrieval
COBRA: Contrastive Bi-Modal Representation Algorithm May 7, 2020 Cross-Modal Retrieval Image Captioning
Cross Modal Retrieval with Querybank Normalisation Dec 23, 2021 Cross-Modal Retrieval Metric Learning