PromptMix: Text-to-image diffusion models enhance the performance of lightweight networks | Jan 30, 2023 | Crowd Counting, Data Augmentation | Unverified | 0
Exploring External Knowledge for Accurate modeling of Visual and Language Problems | Jan 27, 2023 | Image Captioning, Machine Translation | Unverified | 0
Paraphrase Acquisition from Image Captions | Jan 26, 2023 | Articles, Image Captioning | Code Available | 0
Style-Aware Contrastive Learning for Multi-Style Image Captioning | Jan 26, 2023 | Contrastive Learning, Image Captioning | Unverified | 0
Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data | Jan 26, 2023 | Image Captioning, Relational Captioning | Unverified | 0
Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation | Jan 22, 2023 | Common Sense Reasoning, Image Captioning | Unverified | 0
Exploring the Synergy Between Vision-Language Pretraining and ChatGPT for Artwork Captioning: A Preliminary Study | Jan 21, 2023 | Image Captioning, Informativeness | Code Available | 0
Visual Semantic Relatedness Dataset for Image Captioning | Jan 20, 2023 | Image Captioning, text similarity | Code Available | 0
Towards Models that Can See and Read | Jan 18, 2023 | Decoder, Image Captioning | Unverified | 0
Embodied Agents for Efficient Exploration and Smart Scene Description | Jan 17, 2023 | Efficient Exploration, Image Captioning | Unverified | 0
An Image captioning algorithm based on the Hybrid Deep Learning Technique (CNN+GRU) | Jan 6, 2023 | Decoder, Image Captioning | Unverified | 0
Adaptively Clustering Neighbor Elements for Image-Text Generation | Jan 5, 2023 | Clustering, Decoder | Code Available | 0
An Empirical Investigation into the Use of Image Captioning for Automated Software Documentation | Jan 3, 2023 | Image Captioning, Machine Translation | Unverified | 0
Image as a Foreign Language: BEiT Pretraining for Vision and Vision-Language Tasks | Jan 1, 2023 | Cross-Modal Retrieval, Image Captioning | Unverified | 0
Crossing the Gap: Domain Generalization for Image Captioning | Jan 1, 2023 | Domain Generalization, Image Captioning | Unverified | 0
PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3 | Jan 1, 2023 | Image Captioning, Question Answering | Unverified | 0
On the Interpretability of Attention Networks | Dec 30, 2022 | Image Captioning | Code Available | 0
Do DALL-E and Flamingo Understand Each Other? | Dec 23, 2022 | Image Captioning, Image Generation | Unverified | 0
Transferring General Multimodal Pretrained Models to Text Recognition | Dec 19, 2022 | Image Captioning, Optical Character Recognition (OCR) | Unverified | 0
Efficient Image Captioning for Edge Devices | Dec 18, 2022 | CPU, Image Captioning | Unverified | 0
Cross-Modal Similarity-Based Curriculum Learning for Image Captioning | Dec 14, 2022 | Image Captioning, Language Modeling | Unverified | 0
NLIP: Noise-robust Language-Image Pre-training | Dec 14, 2022 | Image Captioning, Image-text Retrieval | Unverified | 0
Cap2Aug: Caption guided Image to Image data Augmentation | Dec 11, 2022 | Classification, Cross-Domain Few-Shot | Unverified | 0
REVEAL: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory | Dec 10, 2022 | Image Captioning, Language Modeling | Code Available | 0
ParsVQA-Caps: A Benchmark for Visual Question Answering and Image Captioning in Persian | Dec 7, 2022 | Image Captioning, Question Answering | Unverified | 0
Switching to Discriminative Image Captioning by Relieving a Bottleneck of Reinforcement Learning | Dec 6, 2022 | Image Captioning, reinforcement-learning | Code Available | 0
Adaptive Testing of Computer Vision Models | Dec 6, 2022 | Image Captioning, object-detection | Code Available | 0
Dataset vs Reality: Understanding Model Performance from the Perspective of Information Need | Dec 6, 2022 | Image Captioning, Information Retrieval | Unverified | 0
Controllable Image Captioning via Prompting | Dec 4, 2022 | controllable image captioning, Image Captioning | Unverified | 0
Weakly Supervised Annotations for Multi-modal Greeting Cards Dataset | Dec 1, 2022 | Image Captioning, Image Generation | Unverified | 0
Focus! Relevant and Sufficient Context Selection for News Image Captioning | Dec 1, 2022 | Image Captioning, Relation Extraction | Unverified | 0
Uncertainty-Aware Image Captioning | Nov 30, 2022 | Caption Generation, Image Captioning | Unverified | 0
CLID: Controlled-Length Image Descriptions with Limited Data | Nov 27, 2022 | controllable image captioning, Image Captioning | Code Available | 0
Predictive linguistic cues for fake news: a societal artificial intelligence problem | Nov 26, 2022 | Attribute, Image Captioning | Unverified | 0
Can Machines Imitate Humans? Integrative Turing Tests for Vision and Language Demonstrate a Narrowing Gap | Nov 23, 2022 | Image Captioning, object-detection | Unverified | 0
Retrieval-Augmented Multimodal Language Modeling | Nov 22, 2022 | Caption Generation, Image Captioning | Unverified | 0
A survey on knowledge-enhanced multimodal learning | Nov 19, 2022 | Conditional Image Generation, Factual Visual Question Answering | Unverified | 0
An Enhanced Object Detection Model for Scene Graph Generation | Nov 18, 2022 | Graph Generation, Image Captioning | Unverified | 0
Feedback is Needed for Retakes: An Explainable Poor Image Notification Framework for the Visually Impaired | Nov 17, 2022 | Image Captioning | Unverified | 0
Zero-shot Image Captioning by Anchor-augmented Vision-Language Space Alignment | Nov 14, 2022 | Computational Efficiency, Image Captioning | Unverified | 0
Investigations in Audio Captioning: Addressing Vocabulary Imbalance and Evaluating Suitability of Language-Centric Performance Metrics | Nov 12, 2022 | Audio captioning, Image Captioning | Unverified | 0
VieCap4H-VLSP 2021: ObjectAoA-Enhancing performance of Object Relation Transformer with Attention on Attention for Vietnamese image captioning | Nov 10, 2022 | Image Captioning, Vietnamese Image Captioning | Unverified | 0
Understanding Cross-modal Interactions in V&L Models that Generate Scene Descriptions | Nov 9, 2022 | Image Captioning, Language Modeling | Unverified | 0
Image Caption Generation for Low-Resource Assamese Language | Nov 1, 2022 | Caption Generation, Decoder | Unverified | 0
DiMBERT: Learning Vision-Language Grounded Representations with Disentangled Multimodal-Attention | Oct 28, 2022 | Image Captioning, Language Modeling | Unverified | 0
FaD-VLP: Fashion Vision-and-Language Pre-training towards Unified Retrieval and Captioning | Oct 26, 2022 | Cross-Modal Retrieval, Decoder | Unverified | 0
Bloom Library: Multimodal Datasets in 300+ Languages for a Variety of Downstream Tasks | Oct 26, 2022 | Image Captioning, Language Modeling | Unverified | 0
RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data | Oct 23, 2022 | Image Captioning, Image-text Retrieval | Unverified | 0
Image-Text Retrieval with Binary and Continuous Label Supervision | Oct 20, 2022 | Image Captioning, Image-text Retrieval | Unverified | 0
Prophet Attention: Predicting Attention with Future Attention for Image Captioning | Oct 19, 2022 | Image Captioning | Unverified | 0