Multimodal Data Augmentation for Image Captioning using Diffusion Models (May 3, 2023). Tags: Data Augmentation, Image Captioning
The Role of Data Curation in Image Captioning (May 5, 2023). Tags: Few-Shot Learning, Image Captioning
Adversarial Inference for Multi-Sentence Video Description (Dec 13, 2018). Tags: Diversity, Image Captioning
LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model (May 3, 2024). Tags: Image Captioning, Instruction Following
Batch-normalized Recurrent Highway Networks (Sep 26, 2018). Tags: Image Captioning
Bangla Image Caption Generation through CNN-Transformer based Encoder-Decoder Network (Oct 24, 2021). Tags: Caption Generation, Decoder
An Empirical Study of Language CNN for Image Captioning (Dec 21, 2016). Tags: Caption Generation, Image Captioning
LMCap: Few-shot Multilingual Image Captioning by Retrieval Augmented Language Model Prompting (May 31, 2023). Tags: Decoder, Image Captioning
Learning to Caption Images through a Lifetime by Asking Questions (Dec 1, 2018). Tags: Active Learning, Image Captioning
BAN-Cap: A Multi-Purpose English-Bangla Image Descriptions Dataset (May 28, 2022). Tags: Image Captioning, Machine Translation
Leveraging image captions for selective whole slide image annotation (Jul 8, 2024). Tags: Diversity, Image Captioning
MHSAN: Multi-Head Self-Attention Network for Visual Semantic Embedding (Jan 11, 2020). Tags: Image Captioning, Image-text Retrieval
Learning Visually-Grounded Semantics from Contrastive Adversarial Samples (Jun 27, 2018). Tags: Adversarial Attack, Image Captioning
Leveraging Human Attention in Novel Object Captioning (Aug 19, 2021). Tags: Image Captioning, Object
LineCap: Line Charts for Data Visualization Captioning Models (Jul 15, 2022). Tags: Data Visualization, Deep Learning
Learning by Correction: Efficient Tuning Task for Zero-Shot Generative Vision-Language Reasoning (Apr 1, 2024). Tags: Image Captioning, Instruction Following
Learning a Deep Embedding Model for Zero-Shot Learning (Nov 15, 2016). Tags: Image Captioning, Sentence
LAViTeR: Learning Aligned Visual and Textual Representations Assisted by Image and Caption Generation (Sep 4, 2021). Tags: Caption Generation, Image Captioning
MMT: Image-guided Story Ending Generation with Multimodal Memory Transformer (Oct 10, 2022). Tags: Decoder, Image Captioning
ADS-Cap: A Framework for Accurate and Diverse Stylized Captioning with Unpaired Stylistic Corpora (Aug 2, 2023). Tags: Contrastive Learning, Diversity
Learn from Downstream and Be Yourself in Multimodal Large Language Model Fine-Tuning (Nov 17, 2024). Tags: Image Captioning, Language Modeling
Learning like a Child: Fast Novel Visual Concept Learning from Sentence Descriptions of Images (Apr 25, 2015). Tags: Image Captioning, Novel Concepts
Large Vision-Language Models for Knowledge-Grounded Data Annotation of Memes (Jan 23, 2025). Tags: Emotion Classification, Image Captioning
DeepPatent2: A Large-Scale Benchmarking Corpus for Technical Drawing Understanding (Nov 7, 2023). Tags: 3D Reconstruction, Benchmarking
An Efficient System for Automatic Map Storytelling -- A Case Study on Historical Maps (Oct 21, 2024). Tags: Image Captioning
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning (Oct 4, 2022). Tags: Image Captioning, Sentence
Language-Driven Region Pointer Advancement for Controllable Image Captioning (Nov 30, 2020). Tags: controllable image captioning, Image Captioning
ANCHOR: LLM-driven News Subject Conditioning for Text-to-Image Synthesis (Apr 15, 2024). Tags: Descriptive, Image Captioning
Kvasir-VQA: A Text-Image Pair GI Tract Dataset (Sep 2, 2024). Tags: Image Captioning, Image Generation
Label-Attention Transformer with Geometrically Coherent Objects for Image Captioning (Sep 16, 2021). Tags: Decoder, Image Captioning
Language Models as Knowledge Bases for Visual Word Sense Disambiguation (Oct 3, 2023). Tags: Image Captioning, Multiple-choice
Learning to Evaluate Image Captioning (Jun 17, 2018). Tags: 8k, Data Augmentation
Look and Modify: Modification Networks for Image Captioning (Sep 7, 2019). Tags: Decoder, Image Captioning
Counterfactual Maximum Likelihood Estimation for Training Deep Networks (Jun 7, 2021). Tags: counterfactual, Domain Generalization
Automatic Report Generation for Histopathology images using pre-trained Vision Transformers and BERT (Dec 3, 2023). Tags: Caption Generation, Decoder
Correlating instruction-tuning (in multimodal models) with vision-language processing (in the brain) (May 26, 2025). Tags: Image Captioning
JoVALE: Detecting Human Actions in Video Using Audiovisual and Language Contexts (Dec 18, 2024). Tags: Action Detection, Descriptive
Core Tokensets for Data-efficient Sequential Training of Transformers (Oct 8, 2024). Tags: Image Captioning, image-classification
JaSPICE: Automatic Evaluation Metric Using Predicate-Argument Structures for Image Captioning Models (Nov 7, 2023). Tags: Image Captioning
Journalistic Guidelines Aware News Image Captioning (Sep 7, 2021). Tags: Caption Generation, Descriptive
JourneyBench: A Challenging One-Stop Vision-Language Understanding Benchmark of Generated Images (Sep 19, 2024). Tags: Hallucination, Image Captioning
KALE: An Artwork Image Captioning System Augmented with Heterogeneous Graph (Sep 17, 2024). Tags: cross-modal alignment, Image Captioning
Automated Medical Report Generation for ECG Data: Bridging Medical Text and Signal Processing with Deep Learning (Dec 5, 2024). Tags: Comment Generation, Decoder
Dense Retrievers Can Fail on Simple Queries: Revealing The Granularity Dilemma of Embeddings (Jun 10, 2025). Tags: Image Captioning
Automated Image Captioning with CNNs and Transformers (Dec 13, 2024). Tags: Descriptive, Hyperparameter Optimization
iParaphrasing: Extracting Visually Grounded Paraphrases via an Image (Jun 12, 2018). Tags: Image Captioning, Question Answering
InstructSeq: Unifying Vision Tasks with Instruction-conditioned Multi-modal Sequence Generation (Nov 30, 2023). Tags: Image Captioning, Referring Expression
Controllable Contextualized Image Captioning: Directing the Visual Narrative through User-Defined Highlights (Jul 16, 2024). Tags: Image Captioning, Multimodal Reasoning
iPIC-XAI: Improving PIC-XAI for Enhanced Image Captioning Explanation (Sep 23, 2023). Tags: Image Captioning, TAG
Auto-Encoding Scene Graphs for Image Captioning (Dec 6, 2018). Tags: Decoder, Image Captioning