Instance-Variant Loss with Gaussian RBF Kernel for 3D Cross-modal Retriveal | May 7, 2023 | Cross-Modal Retrieval, Retrieval | Unverified | 0
Category-Oriented Representation Learning for Image to Multi-Modal Retrieval | May 6, 2023 | Cross-Modal Retrieval, Image Retrieval | Unverified | 0
Deep Lifelong Cross-modal Hashing | Apr 26, 2023 | Cross-Modal Retrieval, Lifelong learning | Unverified | 0
Sample-Specific Debiasing for Better Image-Text Models | Apr 25, 2023 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0
Rethinking Benchmarks for Cross-modal Image-text Retrieval | Apr 21, 2023 | Cross-Modal Retrieval, Image-text Retrieval | Code Available | 1
RoCOCO: Robustness Benchmark of MS-COCO to Stress-test Image-Text Matching Models | Apr 21, 2023 | Cross-Modal Retrieval, Image-text matching | Code Available | 0
Image-text Retrieval via Preserving Main Semantics of Vision | Apr 20, 2023 | Cross-Modal Retrieval, Image-text Retrieval | Code Available | 1
VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset | Apr 17, 2023 | Audio captioning, Audio-Video Question Answering (AVQA) | Code Available | 2
CoVLR: Coordinating Cross-Modal Consistency and Intra-Modal Structure for Vision-Language Retrieval | Apr 15, 2023 | cross-modal alignment, Cross-Modal Retrieval | Unverified | 0
Noisy Correspondence Learning with Meta Similarity Correction | Apr 13, 2023 | Binary Classification, Cross-Modal Retrieval | Code Available | 1
Exposing and Mitigating Spurious Correlations for Cross-Modal Retrieval | Apr 6, 2023 | Cross-Modal Retrieval, Image-text Retrieval | Code Available | 0
AToMiC: An Image/Text Retrieval Test Collection to Support Multimedia Content Creation | Apr 4, 2023 | Cross-Modal Retrieval, Image-text Retrieval | Code Available | 3
Hindi as a Second Language: Improving Visually Grounded Speech with Semantically Similar Samples | Mar 30, 2023 | Cross-Modal Retrieval, Retrieval | Unverified | 0
MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks | Mar 29, 2023 | Cross-Modal Retrieval, Decoder | Code Available | 0
Plug-and-Play Regulators for Image-Text Matching | Mar 23, 2023 | Cross-Modal Retrieval, Image Retrieval | Code Available | 1
MXM-CLR: A Unified Framework for Contrastive Learning of Multifold Cross-Modal Representations | Mar 20, 2023 | Contrastive Learning, Cross-Modal Retrieval | Code Available | 0
Single-branch Network for Multimodal Training | Mar 10, 2023 | Cross-Modal Retrieval, Retrieval | Code Available | 1
Adversarial Modality Alignment Network for Cross-Modal Molecule Retrieval | Mar 8, 2023 | Contrastive Learning, Cross-Modal Retrieval | Code Available | 0
Cross-modal Retrieval with Improved Graph Convolution | Mar 7, 2023 | Cross-Modal Retrieval, Representation Learning | Unverified | 0
FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks | Mar 4, 2023 | Cross-Modal Retrieval, Image Captioning | Code Available | 1
Data leakage in cross-modal retrieval training: A case study | Feb 23, 2023 | Cross-Modal Retrieval, Retrieval | Unverified | 0
Cross-Modal Retrieval with Partially Mismatched Pairs | Feb 22, 2023 | Contrastive Learning, Cross-Modal Retrieval | Code Available | 1
X-TRA: Improving Chest X-ray Tasks with Cross-Modal Retrieval Augmentation | Feb 22, 2023 | Cross-Modal Retrieval, Retrieval | Unverified | 0
VITR: Augmenting Vision Transformers with Relation-Focused Learning for Cross-Modal Information Retrieval | Feb 13, 2023 | Cross-Modal Information Retrieval, Cross-Modal Retrieval | Unverified | 0
Distribution Aligned Feature Clustering for Zero-Shot Sketch-Based Image Retrieval | Jan 17, 2023 | Clustering, Cross-Modal Retrieval | Unverified | 0
Toward Building General Foundation Models for Language, Vision, and Vision-Language Understanding Tasks | Jan 12, 2023 | Cross-Modal Retrieval, Open-Ended Question Answering | Code Available | 0
Scene-centric vs. Object-centric Image-Text Cross-modal Retrieval: A Reproducibility Study | Jan 12, 2023 | Cross-Modal Retrieval, Object | Code Available | 0
Pix2Map: Cross-modal Retrieval for Inferring Street Maps from Images | Jan 10, 2023 | Autonomous Navigation, Cross-Modal Retrieval | Unverified | 0
NAPReg: Nouns As Proxies Regularization for Semantically Aware Cross-Modal Embeddings | Jan 7, 2023 | Cross-Modal Retrieval, Image-text Retrieval | Code Available | 0
Learning Concordant Attention via Target-aware Alignment for Visible-Infrared Person Re-identification | Jan 1, 2023 | Cross-Modal Retrieval, Person Re-Identification | Unverified | 0
Image as a Foreign Language: BEiT Pretraining for Vision and Vision-Language Tasks | Jan 1, 2023 | Cross-Modal Retrieval, Image Captioning | Unverified | 0
Learning Semantic Relationship Among Instances for Image-Text Matching | Jan 1, 2023 | Cross-Modal Retrieval, Image Retrieval | Code Available | 1
RONO: Robust Discriminative Learning With Noisy Labels for 2D-3D Cross-Modal Retrieval | Jan 1, 2023 | Cross-Modal Retrieval, Learning with noisy labels | Code Available | 1
BagFormer: Better Cross-Modal Retrieval via bag-wise interaction | Dec 29, 2022 | Cross-Modal Retrieval, Retrieval | Unverified | 0
Position-guided Text Prompt for Vision-Language Pre-training | Dec 19, 2022 | Cross-Modal Retrieval, Image Captioning | Code Available | 1
Retrieval-based Disentangled Representation Learning with Natural Language Supervision | Dec 15, 2022 | Cross-Modal Retrieval, Disentanglement | Unverified | 0
Scale-Semantic Joint Decoupling Network for Image-text Retrieval in Remote Sensing | Dec 12, 2022 | Cross-Modal Retrieval, Image-text Retrieval | Unverified | 0
Using Multiple Instance Learning to Build Multimodal Representations | Dec 11, 2022 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0
Vision and Structured-Language Pretraining for Cross-Modal Food Retrieval | Dec 8, 2022 | Cross-Modal Retrieval, Food Recognition | Code Available | 1
A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval | Dec 6, 2022 | Cross-Modal Retrieval, Image-text matching | Code Available | 1
Semantic-Conditional Diffusion Networks for Image Captioning | Dec 6, 2022 | Cross-Modal Retrieval, Decoder | Code Available | 2
Normalized Contrastive Learning for Text-Video Retrieval | Nov 30, 2022 | Contrastive Learning, Cross-Modal Retrieval | Code Available | 1
Improving Cross-Modal Retrieval with Set of Diverse Embeddings | Nov 30, 2022 | Cross-Modal Retrieval, Retrieval | Code Available | 1
VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval | Nov 23, 2022 | Cross-Modal Retrieval, Retrieval | Code Available | 1
X^2-VLM: All-In-One Pre-trained Model For Vision-Language Tasks | Nov 22, 2022 | All, Cross-Modal Retrieval | Code Available | 2
TimbreCLIP: Connecting Timbre to Text and Images | Nov 21, 2022 | Cross-Modal Retrieval, Image Generation | Unverified | 0
Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention | Nov 21, 2022 | Cross-Modal Retrieval, Language Modeling | Code Available | 1
AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities | Nov 12, 2022 | Contrastive Learning, Cross-Modal Retrieval | Code Available | 4
Complete Cross-triplet Loss in Label Space for Audio-visual Cross-modal Retrieval | Nov 7, 2022 | Cross-Modal Retrieval, Representation Learning | Unverified | 0
3D Shape Knowledge Graph for Cross-domain 3D Shape Retrieval | Oct 27, 2022 | 3D Shape Retrieval, Cross-Modal Retrieval | Unverified | 0