Sketchformer: Transformer-based Representation for Sketched Structure | Feb 24, 2020 | Cross-Modal Retrieval, Dictionary Learning | Code Available | 1
Sketch Less for More: On-the-Fly Fine-Grained Sketch Based Image Retrieval | Feb 24, 2020 | Cross-Modal Retrieval, Image Retrieval | Code Available | 1
Target-Oriented Deformation of Visual-Semantic Embedding Space | Oct 15, 2019 | Cross-Modal Retrieval, Diversity | Code Available | 1
Visual Semantic Reasoning for Image-Text Matching | Sep 6, 2019 | Cross-Modal Retrieval, Image Retrieval | Code Available | 1
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval | Jun 11, 2019 | Cross-Modal Retrieval, Multiple Instance Learning | Code Available | 1
UniVSE: Robust Visual Semantic Embeddings via Structured Semantic Representations | Apr 11, 2019 | Contrastive Learning, Cross-Modal Retrieval | Code Available | 1
Stacked Cross Attention for Image-Text Matching | Mar 21, 2018 | Cross-Modal Retrieval, Image Retrieval | Code Available | 1
VSE++: Improving Visual-Semantic Embeddings with Hard Negatives | Jul 18, 2017 | Cross-Modal Retrieval, Image Retrieval | Code Available | 1
Multi-Label Cross-Modal Retrieval | Dec 1, 2015 | Cross-Modal Retrieval, Retrieval | Code Available | 1
Order-Embeddings of Images and Language | Nov 19, 2015 | Cross-Modal Retrieval, Image Captioning | Code Available | 1
An analysis of vision-language models for fabric retrieval | Jul 7, 2025 | Attribute, Cross-Modal Retrieval | Unverified | 0
Mask-aware Text-to-Image Retrieval: Referring Expression Segmentation Meets Cross-modal Retrieval | Jun 28, 2025 | Cross-Modal Retrieval, Image Captioning | Unverified | 0
Maximal Matching Matters: Preventing Representation Collapse for Robust Cross-Modal Retrieval | Jun 26, 2025 | Cross-Modal Retrieval, Image-text Retrieval | Unverified | 0
Multimodal Medical Image Binding via Shared Text Embeddings | Jun 22, 2025 | Cross-Modal Retrieval, Medical Image Analysis | Unverified | 0
FedNano: Toward Lightweight Federated Tuning for Pretrained Multimodal Large Language Models | Jun 12, 2025 | Cross-Modal Retrieval, Federated Learning | Unverified | 0
ContextRefine-CLIP for EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2025 | Jun 12, 2025 | Cross-Modal Retrieval, Ensemble Learning | Code Available | 0
SA-Person: Text-Based Person Retrieval with Scene-aware Re-ranking | May 30, 2025 | Cross-Modal Retrieval, Person Retrieval | Unverified | 0
EmotionRankCLAP: Bridging Natural Language Speaking Styles and Ordinal Speech Emotion via Rank-N-Contrast | May 29, 2025 | Contrastive Learning, cross-modal alignment | Unverified | 0
FOLIAGE: Towards Physical Intelligence World Models Via Unbounded Surface Evolution | May 29, 2025 | counterfactual, Cross-Modal Retrieval | Unverified | 0
DocMMIR: A Framework for Document Multi-modal Information Retrieval | May 25, 2025 | Articles, Cross-Modal Retrieval | Code Available | 0
SMOTExT: SMOTE meets Large Language Models | May 19, 2025 | Cross-Modal Retrieval, Data Augmentation | Code Available | 0
Sat2Sound: A Unified Framework for Zero-Shot Soundscape Mapping | May 19, 2025 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0
GMM-Based Comprehensive Feature Extraction and Relative Distance Preservation For Few-Shot Cross-Modal Retrieval | May 19, 2025 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0
CellCLIP -- Learning Perturbation Effects in Cell Painting via Text-Guided Contrastive Learning | May 16, 2025 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0
Towards Cross-modal Retrieval in Chinese Cultural Heritage Documents: Dataset and Solution | May 16, 2025 | Cross-Modal Retrieval, Image to text | Unverified | 0
OMGM: Orchestrate Multiple Granularities and Modalities for Efficient Multimodal Retrieval | May 10, 2025 | Cross-Modal Retrieval, Question Answering | Unverified | 0
Probabilistic Embeddings for Frozen Vision-Language Models: Uncertainty Quantification with Gaussian Process Latent Variable Models | May 8, 2025 | Active Learning, cross-modal alignment | Code Available | 0
The 1st EReL@MIR Workshop on Efficient Representation Learning for Multimodal Information Retrieval | Apr 21, 2025 | Cross-Modal Retrieval, Information Retrieval | Unverified | 0
Improving Sound Source Localization with Joint Slot Attention on Image and Audio | Apr 21, 2025 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0
SemCORE: A Semantic-Enhanced Generative Cross-Modal Retrieval Framework with MLLMs | Apr 17, 2025 | Cross-Modal Retrieval, Image Retrieval | Unverified | 0
PATFinger: Prompt-Adapted Transferable Fingerprinting against Unauthorized Multimodal Dataset Usage | Apr 15, 2025 | Cross-Modal Retrieval, Retrieval | Unverified | 0
Learning Sparse Disentangled Representations for Multimodal Exclusion Retrieval | Apr 4, 2025 | Cross-Modal Retrieval, Disentanglement | Unverified | 0
FineLIP: Extending CLIP's Reach via Fine-Grained Alignment with Longer Text Inputs | Apr 2, 2025 | cross-modal alignment, Cross-Modal Retrieval | Unverified | 0
Seeing Speech and Sound: Distinguishing and Locating Audios in Visual Scenes | Mar 24, 2025 | Cross-Modal Retrieval, Disentanglement | Unverified | 0
PromptHash: Affinity-Prompted Collaborative Cross-Modal Learning for Adaptive Hashing Retrieval | Mar 20, 2025 | Contrastive Learning, Cross-Modal Retrieval | Code Available | 0
NeighborRetr: Balancing Hub Centrality in Cross-Modal Retrieval | Mar 13, 2025 | Cross-Modal Retrieval, Retrieval | Code Available | 0
Adaptive Inner Speech-Text Alignment for LLM-based Speech Translation | Mar 13, 2025 | Cross-Modal Retrieval, Translation | Unverified | 0
Astrea: A MOE-based Visual Understanding Model with Progressive Alignment | Mar 12, 2025 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0
A Recipe for Improving Remote Sensing VLM Zero Shot Generalization | Mar 10, 2025 | Cross-Modal Retrieval, Zero-Shot Cross-Modal Retrieval | Unverified | 0
X2CT-CLIP: Enable Multi-Abnormality Detection in Computed Tomography from Chest Radiography via Tri-Modal Contrastive Learning | Mar 4, 2025 | Anomaly Detection, Computed Tomography (CT) | Unverified | 0
Lightweight Contrastive Distilled Hashing for Online Cross-modal Retrieval | Feb 27, 2025 | Cross-Modal Retrieval, Knowledge Distillation | Unverified | 0
On the Importance of Text Preprocessing for Multimodal Representation Learning and Pathology Report Generation | Feb 26, 2025 | Cross-Modal Retrieval, Hallucination | Unverified | 0
Pathology Report Generation and Multimodal Representation Learning for Cutaneous Melanocytic Lesions | Feb 26, 2025 | Cross-Modal Retrieval, Language Modeling | Unverified | 0
CLASS: Enhancing Cross-Modal Text-Molecule Retrieval Performance and Training Efficiency | Feb 17, 2025 | Cross-Modal Retrieval, Retrieval | Unverified | 0
Zero-Shot Interactive Text-to-Image Retrieval via Diffusion-Augmented Representations | Jan 26, 2025 | Cross-Modal Retrieval, Image Retrieval | Unverified | 0
TSVC:Tripartite Learning with Semantic Variation Consistency for Robust Image-Text Retrieval | Jan 19, 2025 | Cross-Modal Retrieval, Image-text Retrieval | Unverified | 0
Deep Reversible Consistency Learning for Cross-modal Retrieval | Jan 10, 2025 | Cross-Modal Retrieval, Representation Learning | Code Available | 0
Cross-Modal 3D Representation with Multi-View Images and Point Clouds | Jan 1, 2025 | Autonomous Driving, Cross-Modal Retrieval | Unverified | 0
Incorporating Dense Knowledge Alignment into Unified Multimodal Representation Models | Jan 1, 2025 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0
Seeing Speech and Sound: Distinguishing and Locating Audio Sources in Visual Scenes | Jan 1, 2025 | Cross-Modal Retrieval, Disentanglement | Unverified | 0