Image Captioning through Image Transformer | Apr 29, 2020 | Image Captioning, object-detection | Code available
Mining Fine-Grained Image-Text Alignment for Zero-Shot Captioning via Text-Only Training | Jan 4, 2024 | Descriptive, Image Captioning | Code available
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections | May 24, 2022 | Computational Efficiency, cross-modal alignment | Code available
Multimodal Image-Text Matching Improves Retrieval-based Chest X-Ray Report Generation | Mar 29, 2023 | Image Captioning, Image-text matching | Code available
ConvNet Architecture Search for Spatiotemporal Feature Learning | Aug 16, 2017 | Action Classification, Action Recognition | Code available
Convolutional Image Captioning | Nov 24, 2017 | Image Captioning, Text Generation | Code available
Analysis of diversity-accuracy tradeoff in image captioning | Feb 27, 2020 | Diversity, Image Captioning | Code available
Annotation Order Matters: Recurrent Image Annotator for Arbitrary Length Image Tagging | Apr 18, 2016 | Image Captioning, Machine Translation | Code available
Unrestricted Adversarial Examples via Semantic Manipulation | Apr 12, 2019 | Colorization, Image Captioning | Code available
mBLIP: Efficient Bootstrapping of Multilingual Vision-LLMs | Jul 13, 2023 | Image Captioning | Code available
Machine-in-the-Loop Rewriting for Creative Image Captioning | Nov 7, 2021 | Descriptive, Image Captioning | Code available
MHSAN: Multi-Head Self-Attention Network for Visual Semantic Embedding | Jan 11, 2020 | Image Captioning, Image-text Retrieval | Code available
Meshed-Memory Transformer for Image Captioning | Dec 17, 2019 | Image Captioning, Machine Translation | Code available
Beyond Temporal Pooling: Recurrence and Temporal Convolutions for Gesture Recognition in Video | Jun 5, 2015 | Gesture Recognition, Image Captioning | Code available
An Eye for an Ear: Zero-shot Audio Description Leveraging an Image Captioner using Audiovisual Distribution Alignment | Oct 8, 2024 | Audio captioning, Contrastive Learning | Code available
Beyond Human Data: Aligning Multimodal Large Language Models by Iterative Self-Evolution | Dec 20, 2024 | Answer Generation, Image Captioning | Code available
Aesthetic Attributes Assessment of Images | Jul 11, 2019 | Attribute, Image Captioning | Code available
MicarVLMoE: A Modern Gated Cross-Aligned Vision-Language Mixture of Experts Model for Medical Image Captioning and Report Generation | Apr 29, 2025 | cross-modal alignment, Decoder | Code available
An Examination of the Robustness of Reference-Free Image Captioning Evaluation Metrics | May 24, 2023 | Image Captioning, Negation | Code available
LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model | May 3, 2024 | Image Captioning, Instruction Following | Code available
LMCap: Few-shot Multilingual Image Captioning by Retrieval Augmented Language Model Prompting | May 31, 2023 | Decoder, Image Captioning | Code available
A Neural Compositional Paradigm for Image Captioning | Oct 23, 2018 | Diversity, Image Captioning | Code available
LineCap: Line Charts for Data Visualization Captioning Models | Jul 15, 2022 | Data Visualization, Deep Learning | Code available
Look and Modify: Modification Networks for Image Captioning | Sep 7, 2019 | Decoder, Image Captioning | Code available
Learning to Caption Images through a Lifetime by Asking Questions | Dec 1, 2018 | Active Learning, Image Captioning | Code available
Leveraging Human Attention in Novel Object Captioning | Aug 19, 2021 | Image Captioning, Object | Code available
Learning to Evaluate Image Captioning | Jun 17, 2018 | 8k, Data Augmentation | Code available
Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction | Sep 5, 2018 | GPU, Image Captioning | Code available
Learning Visually-Grounded Semantics from Contrastive Adversarial Samples | Jun 27, 2018 | Adversarial Attack, Image Captioning | Code available
Leveraging image captions for selective whole slide image annotation | Jul 8, 2024 | Diversity, Image Captioning | Code available
Lost in Space: Probing Fine-grained Spatial Understanding in Vision and Language Resamplers | Apr 21, 2024 | Diagnostic, Image Captioning | Code available
MilaNLP at SemEval-2022 Task 5: Using Perceiver IO for Detecting Misogynous Memes with Text and Image Modalities | Jul 1, 2022 | Image Captioning | Code available
Learn from Downstream and Be Yourself in Multimodal Large Language Model Fine-Tuning | Nov 17, 2024 | Image Captioning, Language Modeling | Code available
Adversarial Inference for Multi-Sentence Video Description | Dec 13, 2018 | Diversity, Image Captioning | Code available
Learning a Deep Embedding Model for Zero-Shot Learning | Nov 15, 2016 | Image Captioning, Sentence | Code available
Batch-normalized Recurrent Highway Networks | Sep 26, 2018 | Image Captioning | Code available
Bangla Image Caption Generation through CNN-Transformer based Encoder-Decoder Network | Oct 24, 2021 | Caption Generation, Decoder | Code available
An Empirical Study of Language CNN for Image Captioning | Dec 21, 2016 | Caption Generation, Image Captioning | Code available
LAViTeR: Learning Aligned Visual and Textual Representations Assisted by Image and Caption Generation | Sep 4, 2021 | Caption Generation, Image Captioning | Code available
Learning by Correction: Efficient Tuning Task for Zero-Shot Generative Vision-Language Reasoning | Apr 1, 2024 | Image Captioning, Instruction Following | Code available
Language Models as Knowledge Bases for Visual Word Sense Disambiguation | Oct 3, 2023 | Image Captioning, Multiple-choice | Code available
BAN-Cap: A Multi-Purpose English-Bangla Image Descriptions Dataset | May 28, 2022 | Image Captioning, Machine Translation | Code available
Label-Attention Transformer with Geometrically Coherent Objects for Image Captioning | Sep 16, 2021 | Decoder, Image Captioning | Code available
Kvasir-VQA: A Text-Image Pair GI Tract Dataset | Sep 2, 2024 | Image Captioning, Image Generation | Code available
Language-Driven Region Pointer Advancement for Controllable Image Captioning | Nov 30, 2020 | controllable image captioning, Image Captioning | Code available
The Role of Data Curation in Image Captioning | May 5, 2023 | Few-Shot Learning, Image Captioning | Code available
Large Vision-Language Models for Knowledge-Grounded Data Annotation of Memes | Jan 23, 2025 | Emotion Classification, Image Captioning | Code available
JourneyBench: A Challenging One-Stop Vision-Language Understanding Benchmark of Generated Images | Sep 19, 2024 | Hallucination, Image Captioning | Code available
JoVALE: Detecting Human Actions in Video Using Audiovisual and Language Contexts | Dec 18, 2024 | Action Detection, Descriptive | Code available
JaSPICE: Automatic Evaluation Metric Using Predicate-Argument Structures for Image Captioning Models | Nov 7, 2023 | Image Captioning | Code available