Deep Sketched Output Kernel Regression for Structured Prediction | Jun 13, 2024 | Cross-Modal Retrieval, Prediction | Code Available | 0
What If We Recaption Billions of Web Images with LLaMA-3? | Jun 12, 2024 | Cross-Modal Retrieval, Image Generation | Unverified | 0
Merlin: A Vision Language Foundation Model for 3D Computed Tomography | Jun 10, 2024 | 3D Semantic Segmentation, Computed Tomography (CT) | Code Available | 3
Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language | Jun 9, 2024 | Contrastive Learning, Cross-Modal Retrieval | Code Available | 2
No Captions, No Problem: Captionless 3D-CLIP Alignment with Hard Negatives via CLIP Knowledge and LLMs | Jun 4, 2024 | 3D Classification, Cross-Modal Retrieval | Unverified | 0
Multi-Modal Generative Embedding Model | May 29, 2024 | Caption Generation, Cross-Modal Retrieval | Unverified | 0
CaLa: Complementary Association Learning for Augmenting Composed Image Retrieval | May 29, 2024 | Cross-Modal Retrieval, Image Retrieval | Code Available | 1
RREH: Reconstruction Relations Embedded Hashing for Semi-Paired Cross-Modal Retrieval | May 28, 2024 | Cross-Modal Retrieval, Retrieval | Unverified | 0
Distilling Vision-Language Pretraining for Efficient Cross-Modal Retrieval | May 23, 2024 | Cross-Modal Retrieval, Quantization | Unverified | 0
Towards Cross-modal Backward-compatible Representation Learning for Vision-Language Models | May 23, 2024 | Cross-Modal Retrieval, Representation Learning | Unverified | 0
MVBIND: Self-Supervised Music Recommendation For Videos Via Embedding Space Binding | May 15, 2024 | Cross-Modal Retrieval, Music Recommendation | Unverified | 0
Global–Local Information Soft-Alignment for Cross-Modal Remote-Sensing Image–Text Retrieval | May 14, 2024 | Cross-Modal Retrieval, Cross-Modal Retrieval on RSITMD | Unverified | 0
All in One Framework for Multimodal Re-identification in the Wild | May 8, 2024 | All, Cross-Modal Retrieval | Unverified | 0
COM3D: Leveraging Cross-View Correspondence and Cross-Modal Mining for 3D Retrieval | May 7, 2024 | Cross-Modal Retrieval, Retrieval | Unverified | 0
Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models | May 2, 2024 | Cross-Modal Retrieval, Retrieval | Unverified | 0
Efficient Remote Sensing with Harmonized Transfer Learning and Modality Alignment | Apr 28, 2024 | Cross-Modal Retrieval, Image Retrieval | Code Available | 2
3SHNet: Boosting Image-Sentence Retrieval via Visual Semantic-Spatial Self-Highlighting | Apr 26, 2024 | Cross-Modal Retrieval, Retrieval | Code Available | 0
Anchor-aware Deep Metric Learning for Audio-visual Retrieval | Apr 21, 2024 | Cross-Modal Retrieval, Metric Learning | Unverified | 0
Wills Aligner: Multi-Subject Collaborative Brain Visual Decoding | Apr 20, 2024 | Cross-Modal Retrieval, Diversity | Unverified | 0
Dynamic Self-adaptive Multiscale Distillation from Pre-trained Multimodal Large Model for Efficient Cross-modal Representation Learning | Apr 16, 2024 | Cross-Modal Retrieval, Representation Learning | Code Available | 0
Knowledge-enhanced Visual-Language Pretraining for Computational Pathology | Apr 15, 2024 | Cross-Modal Retrieval, Language Modeling | Code Available | 1
Bridging Vision and Language Spaces with Assignment Prediction | Apr 15, 2024 | Cross-Modal Retrieval, Image Captioning | Code Available | 0
Learning with Noisy Correspondence | Apr 13, 2024 | Cross-Modal Retrieval, Cross-modal retrieval with noisy correspondence | Unverified | 0
Cross-modal Retrieval with Noisy Correspondence via Consistency Refining and Mining | Mar 25, 2024 | Cross-Modal Retrieval, Cross-modal retrieval with noisy correspondence | Code Available | 1
VXP: Voxel-Cross-Pixel Large-scale Image-LiDAR Place Recognition | Mar 21, 2024 | Cross-modal place recognition, Cross-Modal Retrieval | Code Available | 1
A Unified Optimal Transport Framework for Cross-Modal Retrieval with Noisy Labels | Mar 20, 2024 | Cross-Modal Retrieval, Retrieval | Unverified | 0
Improving Medical Multi-modal Contrastive Learning with Expert Annotations | Mar 15, 2024 | Contrastive Learning, Cross-Modal Retrieval | Code Available | 0
Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation | Mar 12, 2024 | Cross-Modal Retrieval, GPU | Code Available | 2
Learning to Rematch Mismatched Pairs for Robust Cross-Modal Retrieval | Mar 8, 2024 | Cross-Modal Retrieval, Cross-modal retrieval with noisy correspondence | Code Available | 1
Large Language Models are In-Context Molecule Learners | Mar 7, 2024 | Cross-Modal Retrieval, In-Context Learning | Code Available | 2
Tri-Modal Motion Retrieval by Learning a Joint Embedding Space | Mar 1, 2024 | Cross-Modal Retrieval, Information Retrieval | Unverified | 0
Impression-CLIP: Contrastive Shape-Impression Embedding for Fonts | Feb 26, 2024 | Cross-Modal Retrieval, Retrieval | Code Available | 0
Distinctive Image Captioning: Leveraging Ground Truth Captions in CLIP Guided Reinforcement Learning | Feb 21, 2024 | Cross-Modal Retrieval, Image Captioning | Code Available | 1
Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond | Feb 16, 2024 | Cross-Modal Retrieval, Retrieval | Unverified | 0
Mind the Modality Gap: Towards a Remote Sensing Vision-Language Model via Cross-modal Alignment | Feb 15, 2024 | cross-modal alignment, Cross-Modal Retrieval | Unverified | 0
Large Language Models for Captioning and Retrieving Remote Sensing Images | Feb 9, 2024 | Cross-Modal Retrieval, Decoder | Unverified | 0
Zero-shot sketch-based remote sensing image retrieval based on multi-level and attention-guided tokenization | Feb 3, 2024 | Cross-Modal Retrieval, Image Retrieval | Code Available | 0
Cross-Modal Coordination Across a Diverse Set of Input Modalities | Jan 29, 2024 | Cross-Modal Retrieval, Image Retrieval | Unverified | 0
Enhancing medical vision-language contrastive learning via inter-matching relation modelling | Jan 19, 2024 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0
Developing ChatGPT for Biology and Medicine: A Complete Review of Biomedical Question Answering | Jan 15, 2024 | Cross-Modal Retrieval, Medical Diagnosis | Unverified | 0
Cross-modal Retrieval for Knowledge-based Visual Question Answering | Jan 11, 2024 | Cross-Modal Retrieval, Question Answering | Code Available | 1
Linguistic-Aware Patch Slimming Framework for Fine-grained Cross-Modal Alignment | Jan 1, 2024 | cross-modal alignment, Cross-Modal Retrieval | Code Available | 2
Fine-grained Prototypical Voting with Heterogeneous Mixup for Semi-supervised 2D-3D Cross-modal Retrieval | Jan 1, 2024 | Cross-Modal Retrieval, Retrieval | Unverified | 0
Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation | Dec 27, 2023 | Cross-Modal Retrieval, Cross-modal retrieval with noisy correspondence | Unverified | 0
LeanVec: Searching vectors faster by making them fit | Dec 26, 2023 | Cross-Modal Retrieval, Dimensionality Reduction | Code Available | 2
Masked Contrastive Reconstruction for Cross-modal Medical Image-Report Retrieval | Dec 26, 2023 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0
SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing | Dec 20, 2023 | Attribute, Cross-Modal Retrieval | Code Available | 2
TF-CLIP: Learning Text-free CLIP for Video-based Person Re-Identification | Dec 15, 2023 | Cross-Modal Retrieval, Person Re-Identification | Code Available | 1
CL2CM: Improving Cross-Lingual Cross-Modal Retrieval via Cross-Lingual Knowledge Transfer | Dec 14, 2023 | Cross-Lingual Transfer, Cross-Modal Retrieval | Unverified | 0
WikiMuTe: A web-sourced dataset of semantic descriptions for music audio | Dec 14, 2023 | Articles, Cross-Modal Retrieval | Unverified | 0