An analysis of vision-language models for fabric retrieval | Jul 7, 2025 | Attribute, Cross-Modal Retrieval | Unverified | 0
Mask-aware Text-to-Image Retrieval: Referring Expression Segmentation Meets Cross-modal Retrieval | Jun 28, 2025 | Cross-Modal Retrieval, Image Captioning | Unverified | 0
Maximal Matching Matters: Preventing Representation Collapse for Robust Cross-Modal Retrieval | Jun 26, 2025 | Cross-Modal Retrieval, Image-text Retrieval | Unverified | 0
Multimodal Medical Image Binding via Shared Text Embeddings | Jun 22, 2025 | Cross-Modal Retrieval, Medical Image Analysis | Unverified | 0
FedNano: Toward Lightweight Federated Tuning for Pretrained Multimodal Large Language Models | Jun 12, 2025 | Cross-Modal Retrieval, Federated Learning | Unverified | 0
ContextRefine-CLIP for EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2025 | Jun 12, 2025 | Cross-Modal Retrieval, Ensemble Learning | Code Available | 0
SA-Person: Text-Based Person Retrieval with Scene-aware Re-ranking | May 30, 2025 | Cross-Modal Retrieval, Person Retrieval | Unverified | 0
FOLIAGE: Towards Physical Intelligence World Models Via Unbounded Surface Evolution | May 29, 2025 | counterfactual, Cross-Modal Retrieval | Unverified | 0
EmotionRankCLAP: Bridging Natural Language Speaking Styles and Ordinal Speech Emotion via Rank-N-Contrast | May 29, 2025 | Contrastive Learning, cross-modal alignment | Unverified | 0
DocMMIR: A Framework for Document Multi-modal Information Retrieval | May 25, 2025 | Articles, Cross-Modal Retrieval | Code Available | 0
Sat2Sound: A Unified Framework for Zero-Shot Soundscape Mapping | May 19, 2025 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0
SMOTExT: SMOTE meets Large Language Models | May 19, 2025 | Cross-Modal Retrieval, Data Augmentation | Code Available | 0
GMM-Based Comprehensive Feature Extraction and Relative Distance Preservation For Few-Shot Cross-Modal Retrieval | May 19, 2025 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0
CellCLIP -- Learning Perturbation Effects in Cell Painting via Text-Guided Contrastive Learning | May 16, 2025 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0
Patho-R1: A Multimodal Reinforcement Learning-Based Pathology Expert Reasoner | May 16, 2025 | Cross-Modal Retrieval, Diagnostic | Code Available | 2
Towards Cross-modal Retrieval in Chinese Cultural Heritage Documents: Dataset and Solution | May 16, 2025 | Cross-Modal Retrieval, Image to text | Unverified | 0
MAKE: Multi-Aspect Knowledge-Enhanced Vision-Language Pretraining for Zero-shot Dermatological Assessment | May 14, 2025 | Clinical Knowledge, Contrastive Learning | Code Available | 1
OMGM: Orchestrate Multiple Granularities and Modalities for Efficient Multimodal Retrieval | May 10, 2025 | Cross-Modal Retrieval, Question Answering | Unverified | 0
Probabilistic Embeddings for Frozen Vision-Language Models: Uncertainty Quantification with Gaussian Process Latent Variable Models | May 8, 2025 | Active Learning, cross-modal alignment | Code Available | 0
Disentangling and Generating Modalities for Recommendation in Missing Modality Scenarios | Apr 23, 2025 | Cross-Modal Retrieval, Recommendation Systems | Code Available | 1
Improving Sound Source Localization with Joint Slot Attention on Image and Audio | Apr 21, 2025 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0
The 1st EReL@MIR Workshop on Efficient Representation Learning for Multimodal Information Retrieval | Apr 21, 2025 | Cross-Modal Retrieval, Information Retrieval | Unverified | 0
SemCORE: A Semantic-Enhanced Generative Cross-Modal Retrieval Framework with MLLMs | Apr 17, 2025 | Cross-Modal Retrieval, Image Retrieval | Unverified | 0
PATFinger: Prompt-Adapted Transferable Fingerprinting against Unauthorized Multimodal Dataset Usage | Apr 15, 2025 | Cross-Modal Retrieval, Retrieval | Unverified | 0
Learning Sparse Disentangled Representations for Multimodal Exclusion Retrieval | Apr 4, 2025 | Cross-Modal Retrieval, Disentanglement | Unverified | 0
FineLIP: Extending CLIP's Reach via Fine-Grained Alignment with Longer Text Inputs | Apr 2, 2025 | cross-modal alignment, Cross-Modal Retrieval | Unverified | 0
LRSCLIP: A Vision-Language Foundation Model for Aligning Remote Sensing Image with Longer Text | Mar 25, 2025 | Cross-Modal Retrieval, Hallucination | Code Available | 1
Seeing Speech and Sound: Distinguishing and Locating Audios in Visual Scenes | Mar 24, 2025 | Cross-Modal Retrieval, Disentanglement | Unverified | 0
PromptHash: Affinity-Prompted Collaborative Cross-Modal Learning for Adaptive Hashing Retrieval | Mar 20, 2025 | Contrastive Learning, Cross-Modal Retrieval | Code Available | 0
Derm1M: A Million-scale Vision-Language Dataset Aligned with Clinical Ontology Knowledge for Dermatology | Mar 19, 2025 | Cross-Modal Retrieval, Diagnostic | Code Available | 2
NeighborRetr: Balancing Hub Centrality in Cross-Modal Retrieval | Mar 13, 2025 | Cross-Modal Retrieval, Retrieval | Code Available | 0
Adaptive Inner Speech-Text Alignment for LLM-based Speech Translation | Mar 13, 2025 | Cross-Modal Retrieval, Translation | Unverified | 0
Astrea: A MOE-based Visual Understanding Model with Progressive Alignment | Mar 12, 2025 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0
A Recipe for Improving Remote Sensing VLM Zero Shot Generalization | Mar 10, 2025 | Cross-Modal Retrieval, Zero-Shot Cross-Modal Retrieval | Unverified | 0
X2CT-CLIP: Enable Multi-Abnormality Detection in Computed Tomography from Chest Radiography via Tri-Modal Contrastive Learning | Mar 4, 2025 | Anomaly Detection, Computed Tomography (CT) | Unverified | 0
Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval | Mar 3, 2025 | Cross-Modal Retrieval, Retrieval | Code Available | 1
Composed Multi-modal Retrieval: A Survey of Approaches and Applications | Mar 3, 2025 | Cross-Modal Retrieval, Data Augmentation | Code Available | 2
Lightweight Contrastive Distilled Hashing for Online Cross-modal Retrieval | Feb 27, 2025 | Cross-Modal Retrieval, Knowledge Distillation | Unverified | 0
ReCon: Enhancing True Correspondence Discrimination through Relation Consistency for Robust Noisy Correspondence Learning | Feb 27, 2025 | Cross-Modal Retrieval, Cross-modal retrieval with noisy correspondence | Code Available | 1
Pathology Report Generation and Multimodal Representation Learning for Cutaneous Melanocytic Lesions | Feb 26, 2025 | Cross-Modal Retrieval, Language Modeling | Unverified | 0
On the Importance of Text Preprocessing for Multimodal Representation Learning and Pathology Report Generation | Feb 26, 2025 | Cross-Modal Retrieval, Hallucination | Unverified | 0
CLASS: Enhancing Cross-Modal Text-Molecule Retrieval Performance and Training Efficiency | Feb 17, 2025 | Cross-Modal Retrieval, Retrieval | Unverified | 0
GAIA: A Global, Multi-modal, Multi-scale Vision-Language Dataset for Remote Sensing Image Analysis | Feb 13, 2025 | Cross-Modal Retrieval, Image Captioning | Code Available | 1
Zero-Shot Interactive Text-to-Image Retrieval via Diffusion-Augmented Representations | Jan 26, 2025 | Cross-Modal Retrieval, Image Retrieval | Unverified | 0
TSVC:Tripartite Learning with Semantic Variation Consistency for Robust Image-Text Retrieval | Jan 19, 2025 | Cross-Modal Retrieval, Image-text Retrieval | Unverified | 0
Deep Reversible Consistency Learning for Cross-modal Retrieval | Jan 10, 2025 | Cross-Modal Retrieval, Representation Learning | Code Available | 0
Robust Self-Paced Hashing for Cross-Modal Retrieval with Noisy Labels | Jan 3, 2025 | Computational Efficiency, Cross-Modal Retrieval | Code Available | 1
Seeing Speech and Sound: Distinguishing and Locating Audio Sources in Visual Scenes | Jan 1, 2025 | Cross-Modal Retrieval, Disentanglement | Unverified | 0
Incorporating Dense Knowledge Alignment into Unified Multimodal Representation Models | Jan 1, 2025 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0
Fuzzy Multimodal Learning for Trusted Cross-modal Retrieval | Jan 1, 2025 | Cross-Modal Retrieval, Retrieval | Code Available | 1