Fine-Grained Image-Text Matching by Cross-Modal Hard Aligning Network | Jan 1, 2023 | Image-text matching, Retrieval | Code | Code Available | 1
Learning Semantic Relationship Among Instances for Image-Text Matching | Jan 1, 2023 | Cross-Modal Retrieval, Image Retrieval | Code | Code Available | 1
Multimodal Matching-aware Co-attention Networks with Mutual Knowledge Distillation for Fake News Detection | Dec 12, 2022 | Fake News Detection, Image-text matching | - | Unverified | 0
Uniform Masking Prevails in Vision-Language Pretraining | Dec 10, 2022 | Image-text matching, Language Modeling | - | Unverified | 0
A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval | Dec 6, 2022 | Cross-Modal Retrieval, Image-text matching | Code | Code Available | 1
ComCLIP: Training-Free Compositional Image and Text Matching | Nov 25, 2022 | Image-text matching, Image-text Retrieval | Code | Code Available | 1
Self-supervised vision-language pretraining for Medical visual question answering | Nov 24, 2022 | Contrastive Learning, Image-text matching | Code | Code Available | 1
UPainting: Unified Text-to-Image Diffusion Generation with Cross-modal Guidance | Oct 28, 2022 | Image Generation, Image-text matching | - | Unverified | 0
Dissecting Deep Metric Learning Losses for Image-Text Retrieval | Oct 21, 2022 | Cross-Modal Retrieval, Image-text matching | Code | Code Available | 0
Do Vision-and-Language Transformers Learn Grounded Predicate-Noun Dependencies? | Oct 21, 2022 | Image-text matching, Language Modeling | Code | Code Available | 0
MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model | Oct 11, 2022 | Contrastive Learning, Image-text matching | Code | Code Available | 1
AdsCVLR: Commercial Visual-Linguistic Representation Modeling in Sponsored Search | Oct 10, 2022 | Contrastive Learning, Image-text matching | - | Unverified | 0
GRIT-VLP: Grouped Mini-batch Sampling for Efficient Vision and Language Pre-training | Aug 8, 2022 | Image-text matching, Language Modeling | Code | Code Available | 1
ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval | Jul 29, 2022 | Cross-Modal Retrieval, Image-text matching | Code | Code Available | 0
Zero-Shot Video Captioning with Evolving Pseudo-Tokens | Jul 22, 2022 | Image Captioning, Image-text matching | Code | Code Available | 1
Don't Stop Learning: Towards Continual Learning for the CLIP Model | Jul 19, 2022 | Continual Learning, Image-text matching | - | Unverified | 0
Open-Vocabulary Multi-Label Classification via Multi-Modal Knowledge Transfer | Jul 5, 2022 | Image-text matching, Knowledge Distillation | Code | Code Available | 1
GR-GAN: Gradual Refinement Text-to-image Generation | May 23, 2022 | Generative Adversarial Network, Image Generation | Code | Code Available | 0
CCMB: A Large-scale Chinese Cross-modal Benchmark | May 8, 2022 | image-classification, Image Classification | Code | Code Available | 1
Language Models Can See: Plugging Visual Controls in Text Generation | May 5, 2022 | Image Captioning, Image-text matching | Code | Code Available | 2
Declaration-based Prompt Tuning for Visual Question Answering | May 5, 2022 | Image-text matching, Language Modeling | Code | Code Available | 1
Uncertainty-based Cross-Modal Retrieval with Probabilistic Representations | Apr 20, 2022 | Cross-Modal Retrieval, Image Retrieval | - | Unverified | 0
No Token Left Behind: Explainability-Aided Image Classification and Generation | Apr 11, 2022 | image-classification, Image Classification | Code | Code Available | 1
ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO | Apr 7, 2022 | Image-text matching, Text Matching | Code | Code Available | 1
DT2I: Dense Text-to-Image Generation from Region Descriptions | Apr 5, 2022 | Conditional Image Generation, Image Generation | - | Unverified | 0
Two-stream Hierarchical Similarity Reasoning for Image-text Matching | Mar 10, 2022 | Image-text matching, Image to text | - | Unverified | 0
Dual Embodied-Symbolic Concept Representations for Deep Learning | Mar 1, 2022 | class-incremental learning, Class Incremental Learning | - | Unverified | 0
MVPTR: Multi-Level Semantic Alignment for Vision-Language Pre-Training via Multi-Stage Learning | Jan 29, 2022 | Image-text matching, Language Modeling | Code | Code Available | 1
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | Jan 28, 2022 | Image Captioning, Image-text matching | Code | Code Available | 5
Unpaired Referring Expression Grounding via Bidirectional Cross-Modal Matching | Jan 18, 2022 | Image-text matching, Referring Expression | - | Unverified | 0
Probing the Role of Positional Information in Vision-Language Models | Jan 16, 2022 | Contrastive Learning, Image-text matching | - | Unverified | 0
Negative-Aware Attention Framework for Image-Text Matching | Jan 1, 2022 | Image-text matching, Text Matching | Code | Code Available | 1
Unified Multimodal Pre-training and Prompt-based Tuning for Vision-Language Understanding and Generation | Dec 10, 2021 | Image-text matching, Image-text Retrieval | - | Unverified | 0
Embedding Arithmetic of Multimodal Queries for Image Retrieval | Dec 6, 2021 | Image Retrieval, Image-text matching | - | Unverified | 0
DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting | Dec 2, 2021 | Image-text matching, Instance Segmentation | Code | Code Available | 1
Learning with Noisy Correspondence for Cross-modal Matching | Dec 1, 2021 | Cross-Modal Retrieval, Cross-modal retrieval with noisy correspondence | Code | Code Available | 1
UFO: A UniFied TransfOrmer for Vision-Language Representation Learning | Nov 19, 2021 | Image Captioning, Image-text matching | - | Unverified | 0
More Than Just Attention: Improving Cross-Modal Attentions with Contrastive Constraints for Image-Text Matching | Nov 16, 2021 | Contrastive Learning, Image-text matching | - | Unverified | 0
MURAL: Multimodal, Multitask Representations Across Languages | Nov 1, 2021 | Cross-Modal Retrieval, Image-text matching | - | Unverified | 0
Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching | Oct 6, 2021 | Image Captioning, Image-text matching | - | Unverified | 0
MURAL: Multimodal, Multitask Retrieval Across Languages | Sep 10, 2021 | Cross-Modal Retrieval, Image-text matching | - | Unverified | 0
Hashing based Efficient Inference for Image-Text Matching | Aug 1, 2021 | Image-text matching, Text Matching | - | Unverified | 0
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation | Jul 16, 2021 | Cross-Modal Retrieval, Grounded language learning | Code | Code Available | 1
A Self-Boosting Framework for Automated Radiographic Report Generation | Jun 19, 2021 | Image Captioning, Image-text matching | - | Unverified | 0
Step-Wise Hierarchical Alignment Network for Image-Text Matching | Jun 11, 2021 | Image-text matching, Text Matching | - | Unverified | 0
A Deep Local and Global Scene-Graph Matching for Image-Text Retrieval | Jun 4, 2021 | Graph Matching, Image Retrieval | Code | Code Available | 1
Towards Efficient Cross-Modal Visual Textual Retrieval using Transformer-Encoder Deep Features | Jun 1, 2021 | Cross-Modal Retrieval, Image Retrieval | - | Unverified | 0
More Than Just Attention: Improving Cross-Modal Attentions with Contrastive Constraints for Image-Text Matching | May 20, 2021 | Contrastive Learning, Cross-Modal Retrieval | - | Unverified | 0
VL-NMS: Breaking Proposal Bottlenecks in Two-Stage Visual-Language Matching | May 12, 2021 | Image-text matching, Referring Expression | - | Unverified | 0
Discrete-continuous Action Space Policy Gradient-based Attention for Image-Text Matching | Apr 21, 2021 | Image-text matching, Text Matching | - | Unverified | 0