Backdoor Attack on Unpaired Medical Image-Text Foundation Models: A Pilot Study on MedCLIP | Jan 1, 2024 | Backdoor Attack, Contrastive Learning | Code Available | 0
Negative Pre-aware for Noisy Cross-modal Matching | Dec 10, 2023 | Cross-modal retrieval with noisy correspondence, Image-text matching | Code Available | 1
OT-Attack: Enhancing Adversarial Transferability of Vision-Language Models via Optimal Transport Optimization | Dec 7, 2023 | Adversarial Attack, Data Augmentation | Unverified | 0
CILF-CIAE: CLIP-driven Image-Language Fusion for Correcting Inverse Age Estimation | Dec 4, 2023 | Age Estimation, Image-text matching | Unverified | 0
Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding | Nov 30, 2023 | Attribute, Compositional Zero-Shot Learning | Code Available | 1
Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models | Nov 28, 2023 | Image Captioning, Image-text matching | Code Available | 1
MMoE: Enhancing Multimodal Models with Mixtures of Multimodal Interaction Experts | Nov 16, 2023 | Binary Classification, Descriptive | Code Available | 1
Active Mining Sample Pair Semantics for Image-text Matching | Nov 9, 2023 | Active Learning, Image-text matching | Unverified | 0
A New Fine-grained Alignment Method for Image-text Matching | Nov 3, 2023 | Image-text matching, Image-text Retrieval | Unverified | 0
Cross-modal Active Complementary Learning with Self-refining Correspondence | Oct 26, 2023 | Cross-modal retrieval with noisy correspondence, Image-text matching | Code Available | 1
Learning Comprehensive Representations with Richer Self for Text-to-Image Person Re-Identification | Oct 17, 2023 | Image Retrieval, Image-text matching | Unverified | 0
Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval | Sep 29, 2023 | Cross-Modal Retrieval, Image-text matching | Code Available | 1
Align before Search: Aligning Ads Image to Text for Accurate Cross-Modal Sponsored Search | Sep 28, 2023 | cross-modal alignment, Cross-Modal Retrieval | Code Available | 0
Dynamic Visual Semantic Sub-Embeddings and Fast Re-Ranking | Sep 15, 2023 | Image-text matching, Re-Ranking | Unverified | 0
Improving Multimodal Classification of Social Media Posts by Leveraging Image-Text Auxiliary Tasks | Sep 14, 2023 | Image-text matching, Sarcasm Detection | Code Available | 0
Towards Better Multi-modal Keyphrase Generation via Visual Entity Enhancement and Multi-granularity Image Noise Filtering | Sep 9, 2023 | Image Captioning, Image-text matching | Code Available | 0
ViLTA: Enhancing Vision-Language Pre-training through Textual Augmentation | Aug 31, 2023 | Image-text matching, Language Modeling | Unverified | 0
Parameter-Efficient Transfer Learning for Remote Sensing Image-Text Retrieval | Aug 24, 2023 | Cross-Modal Retrieval, Image-text matching | Code Available | 1
Uniformly Distributed Category Prototype-Guided Vision-Language Framework for Long-Tail Recognition | Aug 24, 2023 | Attribute, Image-text matching | Unverified | 0
EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE | Aug 23, 2023 | Image-text matching, Image-text Retrieval | Unverified | 0
Towards Grounded Visual Spatial Reasoning in Multi-Modal Vision Language Models | Aug 18, 2023 | Image-text matching, Object Localization | Unverified | 0
Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination | Aug 8, 2023 | Image-text matching, Representation Learning | Code Available | 1
Grounded Image Text Matching with Mismatched Relation Reasoning | Aug 2, 2023 | Image-text matching, Relation | Unverified | 0
A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models | Jul 24, 2023 | Image Generation, Image-text matching | Code Available | 2
Advancing Visual Grounding with Scene Knowledge: Benchmark and Method | Jul 21, 2023 | Image-text matching, Text Matching | Code Available | 1
UniFine: A Unified and Fine-grained Approach for Zero-shot Vision-Language Understanding | Jul 3, 2023 | Image-text matching, Sentence | Code Available | 1
Towards Unified Text-based Person Retrieval: A Large-scale Multi-Attribute and Language Search Benchmark | Jun 5, 2023 | Attribute, Image-text matching | Code Available | 1
Revisiting the Role of Language Priors in Vision-Language Models | Jun 2, 2023 | Image-text matching, Image-text Retrieval | Code Available | 1
Improved Probabilistic Image-Text Representations | May 29, 2023 | Data Augmentation, Image-text matching | Code Available | 1
Are Diffusion Models Vision-And-Language Reasoners? | May 25, 2023 | Denoising, Image Generation | Code Available | 1
LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation | May 18, 2023 | Attribute, Image Generation | Code Available | 1
MALM: Mask Augmentation based Local Matching for Food-Recipe Retrieval | May 18, 2023 | Image-text matching, Retrieval | Code Available | 0
Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners | May 18, 2023 | Image Generation, Image-text matching | Code Available | 1
Probing the Role of Positional Information in Vision-Language Models | May 17, 2023 | Contrastive Learning, Image-text matching | Unverified | 0
Scene Text Recognition with Image-Text Matching-guided Dictionary | May 8, 2023 | Image-text matching, Language Modeling | Unverified | 0
Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations | May 6, 2023 | Image-text matching, Text Matching | Code Available | 1
Vision Meets Definitions: Unsupervised Visual Word Sense Disambiguation Incorporating Gloss Information | May 2, 2023 | Bayesian Inference, Image-text matching | Code Available | 0
RoCOCO: Robustness Benchmark of MS-COCO to Stress-test Image-Text Matching Models | Apr 21, 2023 | Cross-Modal Retrieval, Image-text matching | Code Available | 0
Multi-Modal Representation Learning with Text-Driven Soft Masks | Apr 3, 2023 | Contrastive Learning, Data Augmentation | Unverified | 0
Multimodal Image-Text Matching Improves Retrieval-based Chest X-Ray Report Generation | Mar 29, 2023 | Image Captioning, Image-text matching | Code Available | 1
Integrating Language Guidance Into Image-Text Matching for Correcting False Negatives | Mar 24, 2023 | Cross-modal retrieval with noisy correspondence, Image-text matching | Code Available | 0
Plug-and-Play Regulators for Image-Text Matching | Mar 23, 2023 | Cross-Modal Retrieval, Image Retrieval | Code Available | 1
Increasing Textual Context Size Boosts Medical Image-Text Matching | Mar 23, 2023 | Image-text matching, Text Matching | Code Available | 0
BiCro: Noisy Correspondence Rectification for Multi-modality Data via Bi-directional Cross-modal Similarity Consistency | Mar 22, 2023 | Cross-modal retrieval with noisy correspondence, Image-text matching | Code Available | 1
Cross-Modal Implicit Relation Reasoning and Aligning for Text-to-Image Person Retrieval | Mar 22, 2023 | Image-text matching, Language Modeling | Code Available | 2
Refined Vision-Language Modeling for Fine-grained Multi-modal Pre-training | Mar 9, 2023 | Image-text matching, Language Modeling | Unverified | 0
Selectively Hard Negative Mining for Alleviating Gradient Vanishing in Image-Text Matching | Mar 1, 2023 | Image-text matching, Text Matching | Unverified | 0
BrainCLIP: Bridging Brain and Visual-Linguistic Representation Via CLIP for Generic Natural Visual Stimulus Decoding | Feb 25, 2023 | Brain Decoding, Image Generation | Code Available | 1
VL-Match: Enhancing Vision-Language Pretraining with Token-Level and Instance-Level Matching | Jan 1, 2023 | Image-text matching, Image-text Retrieval | Unverified | 0
Weakly Supervised Referring Image Segmentation with Intra-Chunk and Inter-Chunk Consistency | Jan 1, 2023 | Image Segmentation, Image-text matching | Unverified | 0