Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models | May 2, 2024 | Cross-Modal Retrieval, Retrieval | Unverified
3SHNet: Boosting Image-Sentence Retrieval via Visual Semantic-Spatial Self-Highlighting | Apr 26, 2024 | Cross-Modal Retrieval, Retrieval | Code Available
Anchor-aware Deep Metric Learning for Audio-visual Retrieval | Apr 21, 2024 | Cross-Modal Retrieval, Metric Learning | Unverified
Wills Aligner: Multi-Subject Collaborative Brain Visual Decoding | Apr 20, 2024 | Cross-Modal Retrieval, Diversity | Unverified
Dynamic Self-adaptive Multiscale Distillation from Pre-trained Multimodal Large Model for Efficient Cross-modal Representation Learning | Apr 16, 2024 | Cross-Modal Retrieval, Representation Learning | Code Available
Bridging Vision and Language Spaces with Assignment Prediction | Apr 15, 2024 | Cross-Modal Retrieval, Image Captioning | Code Available
Learning with Noisy Correspondence | Apr 13, 2024 | Cross-Modal Retrieval, Cross-modal retrieval with noisy correspondence | Unverified
A Unified Optimal Transport Framework for Cross-Modal Retrieval with Noisy Labels | Mar 20, 2024 | Cross-Modal Retrieval, Retrieval | Unverified
Improving Medical Multi-modal Contrastive Learning with Expert Annotations | Mar 15, 2024 | Contrastive Learning, Cross-Modal Retrieval | Code Available
Tri-Modal Motion Retrieval by Learning a Joint Embedding Space | Mar 1, 2024 | Cross-Modal Retrieval, Information Retrieval | Unverified
Impression-CLIP: Contrastive Shape-Impression Embedding for Fonts | Feb 26, 2024 | Cross-Modal Retrieval, Retrieval | Code Available
Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond | Feb 16, 2024 | Cross-Modal Retrieval, Retrieval | Unverified
Mind the Modality Gap: Towards a Remote Sensing Vision-Language Model via Cross-modal Alignment | Feb 15, 2024 | cross-modal alignment, Cross-Modal Retrieval | Unverified
Large Language Models for Captioning and Retrieving Remote Sensing Images | Feb 9, 2024 | Cross-Modal Retrieval, Decoder | Unverified
Zero-shot sketch-based remote sensing image retrieval based on multi-level and attention-guided tokenization | Feb 3, 2024 | Cross-Modal Retrieval, Image Retrieval | Code Available
Cross-Modal Coordination Across a Diverse Set of Input Modalities | Jan 29, 2024 | Cross-Modal Retrieval, Image Retrieval | Unverified
Enhancing medical vision-language contrastive learning via inter-matching relation modelling | Jan 19, 2024 | Contrastive Learning, Cross-Modal Retrieval | Unverified
Developing ChatGPT for Biology and Medicine: A Complete Review of Biomedical Question Answering | Jan 15, 2024 | Cross-Modal Retrieval, Medical Diagnosis | Unverified
Fine-grained Prototypical Voting with Heterogeneous Mixup for Semi-supervised 2D-3D Cross-modal Retrieval | Jan 1, 2024 | Cross-Modal Retrieval, Retrieval | Unverified
Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation | Dec 27, 2023 | Cross-Modal Retrieval, Cross-modal retrieval with noisy correspondence | Unverified
Masked Contrastive Reconstruction for Cross-modal Medical Image-Report Retrieval | Dec 26, 2023 | Contrastive Learning, Cross-Modal Retrieval | Unverified
CL2CM: Improving Cross-Lingual Cross-Modal Retrieval via Cross-Lingual Knowledge Transfer | Dec 14, 2023 | Cross-Lingual Transfer, Cross-Modal Retrieval | Unverified
WikiMuTe: A web-sourced dataset of semantic descriptions for music audio | Dec 14, 2023 | Articles, Cross-Modal Retrieval | Unverified
Uni3DL: Unified Model for 3D and Language Understanding | Dec 5, 2023 | Cross-Modal Retrieval, Instance Segmentation | Unverified
T3D: Advancing 3D Medical Vision-Language Pre-training by Learning Multi-View Visual Consistency | Dec 3, 2023 | Clinical Knowledge, Contrastive Learning | Unverified
Invisible Relevance Bias: Text-Image Retrieval Models Prefer AI-Generated Images | Nov 23, 2023 | Cross-Modal Retrieval, Image Retrieval | Code Available
Contrastive Transformer Learning with Proximity Data Generation for Text-Based Person Search | Nov 15, 2023 | Contrastive Learning, Cross-Modal Retrieval | Code Available
InvGC: Robust Cross-Modal Retrieval by Inverse Graph Convolution | Oct 20, 2023 | Cross-Modal Retrieval, Retrieval | Code Available
Two-Stage Triplet Loss Training with Curriculum Augmentation for Audio-Visual Retrieval | Oct 20, 2023 | Cross-Modal Retrieval, Retrieval | Unverified
Balance Act: Mitigating Hubness in Cross-Modal Retrieval with Query and Gallery Banks | Oct 17, 2023 | Cross-Modal Retrieval, Retrieval | Code Available
Direction-Oriented Visual-semantic Embedding Model for Remote Sensing Image-text Retrieval | Oct 12, 2023 | Cross-Modal Retrieval, Image-text Retrieval | Unverified
Align before Search: Aligning Ads Image to Text for Accurate Cross-Modal Sponsored Search | Sep 28, 2023 | cross-modal alignment, Cross-Modal Retrieval | Code Available
ELIP: Efficient Language-Image Pre-training with Fewer Vision Tokens | Sep 28, 2023 | Cross-Modal Retrieval, GPU | Code Available
Implicit Differentiable Outlier Detection Enable Robust Deep Multimodal Analysis | Sep 21, 2023 | Cross-Modal Retrieval, Image Captioning | Code Available
Sound Source Localization is All about Cross-Modal Alignment | Sep 19, 2023 | All, cross-modal alignment | Unverified
Dual-view Curricular Optimal Transport for Cross-lingual Cross-modal Retrieval | Sep 11, 2023 | Cross-Lingual Transfer, Cross-Modal Retrieval | Unverified
Cross-Modal Retrieval Meets Inference:Improving Zero-Shot Classification with Cross-Modal Retrieval | Aug 29, 2023 | Cross-Modal Retrieval, image-classification | Unverified
Extending Cross-Modal Retrieval with Interactive Learning to Improve Image Retrieval Performance in Forensics | Aug 28, 2023 | Cross-Modal Retrieval, Image Retrieval | Unverified
Video and Audio are Images: A Cross-Modal Mixer for Original Data on Video-Audio Retrieval | Aug 26, 2023 | Cross-Modal Retrieval, Decoder | Unverified
PiTL: Cross-modal Retrieval with Weakly-supervised Vision-language Pre-training via Prompting | Jul 14, 2023 | Cross-Modal Retrieval, Image to text | Unverified
A scoping review on multimodal deep learning in biomedical images and texts | Jul 14, 2023 | Cross-Modal Retrieval, Decision Making | Unverified
Alternative Telescopic Displacement: An Efficient Multimodal Alignment Method | Jun 29, 2023 | Arrhythmia Detection, Cross-Modal Retrieval | Code Available
Multimodal Relation Extraction with Cross-Modal Retrieval and Synthesis | May 25, 2023 | Cross-Modal Retrieval, Object | Unverified
Continual Vision-Language Representation Learning with Off-Diagonal Information | May 11, 2023 | Continual Learning, Contrastive Learning | Unverified
Instance-Variant Loss with Gaussian RBF Kernel for 3D Cross-modal Retriveal | May 7, 2023 | Cross-Modal Retrieval, Retrieval | Unverified
Category-Oriented Representation Learning for Image to Multi-Modal Retrieval | May 6, 2023 | Cross-Modal Retrieval, Image Retrieval | Unverified
Deep Lifelong Cross-modal Hashing | Apr 26, 2023 | Cross-Modal Retrieval, Lifelong learning | Unverified
Sample-Specific Debiasing for Better Image-Text Models | Apr 25, 2023 | Contrastive Learning, Cross-Modal Retrieval | Unverified
RoCOCO: Robustness Benchmark of MS-COCO to Stress-test Image-Text Matching Models | Apr 21, 2023 | Cross-Modal Retrieval, Image-text matching | Code Available
CoVLR: Coordinating Cross-Modal Consistency and Intra-Modal Structure for Vision-Language Retrieval | Apr 15, 2023 | cross-modal alignment, Cross-Modal Retrieval | Unverified