Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks | Aug 22, 2022 | All, Cross-Modal Retrieval
DenseCap: Fully Convolutional Localization Networks for Dense Captioning | Nov 24, 2015 | Dense Captioning, Image Captioning
Human Attention in Image Captioning: Dataset and Analysis | Mar 6, 2019 | Image Captioning, Image Description
A Suite of Generative Tasks for Multi-Level Multimodal Webpage Understanding | May 5, 2023 | Articles, Image Captioning
Multilingual Image Description with Neural Sequence Models | Oct 15, 2015 | Image Captioning, Image Description
Image2tweet: Datasets in Hindi and English for Generating Tweets from Images | Dec 1, 2021 | Image Captioning, World Knowledge
Semantic Map-based Generation of Navigation Instructions | Mar 28, 2024 | Image Captioning
Multimodal Data Augmentation for Image Captioning using Diffusion Models | May 3, 2023 | Data Augmentation, Image Captioning
CLID: Controlled-Length Image Descriptions with Limited Data | Nov 27, 2022 | controllable image captioning, Image Captioning
A Hybrid Model for Combining Neural Image Caption and k-Nearest Neighbor Approach for Image Captioning | May 9, 2021 | Image Captioning, regression
CLDTracker: A Comprehensive Language Description for Visual Tracking | May 29, 2025 | Image Captioning, Visual Tracking
Class-Conditional self-reward mechanism for improved Text-to-Image models | May 22, 2024 | Image Captioning, object-detection
ILLUME: Rationalizing Vision-Language Models through Human Interactions | Aug 17, 2022 | Image Captioning, Question Answering
Semi-Autoregressive Image Captioning | Oct 11, 2021 | Decoder, Image Captioning
Delete, Retrieve, Generate: A Simple Approach to Sentiment and Style Transfer | Apr 17, 2018 | Attribute, Image Captioning
Multimodal Learning for Hateful Memes Detection | Nov 25, 2020 | Image Captioning, Multimodal Deep Learning
Semantic Object Accuracy for Generative Text-to-Image Synthesis | Oct 29, 2019 | Image Captioning, Image Generation
ViPCap: Retrieval Text-Based Visual Prompts for Lightweight Image Captioning | Dec 26, 2024 | Image Captioning, Retrieval
Training for Diversity in Image Paragraph Captioning | Oct 1, 2018 | Diversity, Image Captioning
Semi-supervised Multimodal Representation Learning through a Global Workspace | Jun 27, 2023 | Image Captioning, Image Generation
Training-free Zero-shot Composed Image Retrieval via Weighted Modality Fusion and Similarity | Sep 7, 2024 | Image Captioning, Image Retrieval
SemStyle: Learning to Generate Stylised Image Captions using Unaligned Text | May 18, 2018 | Descriptive, Image Captioning
A High-Quality Text-Rich Image Instruction Tuning Dataset via Hybrid Instruction Generation | Dec 20, 2024 | Image Captioning
CIC-BART-SSA: Controllable Image Captioning with Structured Semantic Augmentation | Jul 16, 2024 | controllable image captioning, Data Augmentation
ICU: Conquering Language Barriers in Vision-and-Language Modeling by Dividing the Tasks into Image Captioning and Language Understanding | Oct 19, 2023 | Image Captioning, Language Modeling
Iconographic Image Captioning for Artworks | Feb 7, 2021 | Image Captioning
Sequence Modeling with Unconstrained Generation Order | Nov 1, 2019 | Image Captioning, Machine Translation
Target-oriented Sentiment Classification with Sequential Cross-modal Semantic Graph | Aug 19, 2022 | Decoder, Image Captioning
Adaptively Clustering Neighbor Elements for Image-Text Generation | Jan 5, 2023 | Clustering, Decoder
Set Prediction in the Latent Space | Dec 1, 2021 | Image Captioning, object-detection
ICECAP: Information Concentrated Entity-aware Image Captioning | Aug 4, 2021 | Articles, Image Captioning
Hyperparameter-Free Approach for Faster Minimum Bayes Risk Decoding | Jan 5, 2024 | Image Captioning, Machine Translation
Defoiling Foiled Image Captions | May 16, 2018 | Descriptive, Image Captioning
Cascaded Revision Network for Novel Object Captioning | Aug 6, 2019 | Image Captioning, Object
Natural Language Object Retrieval | Nov 13, 2015 | Image Captioning, Image Retrieval
Transform, Contrast and Tell: Coherent Entity-Aware Multi-Image Captioning | Feb 4, 2023 | Caption Generation, Coherence Evaluation
How Time Matters: Learning Time-Decay Attention for Contextual Spoken Language Understanding in Dialogues | Jun 1, 2018 | Dialogue State Tracking, Image Captioning
HICEScore: A Hierarchical Metric for Image Captioning Evaluation | Jul 26, 2024 | Descriptive, Image Captioning
Deep Visual-Semantic Alignments for Generating Image Descriptions | Dec 7, 2014 | Cross-Modal Retrieval, Image Captioning
Show, Adapt and Tell: Adversarial Training of Cross-domain Image Captioner | May 2, 2017 | Image Captioning, Sentence
DeepPatent2: A Large-Scale Benchmarking Corpus for Technical Drawing Understanding | Nov 7, 2023 | 3D Reconstruction, Benchmarking
Neural Baby Talk | Mar 27, 2018 | Image Captioning, Object
Deep Metric Learning Beyond Binary Supervision | Apr 21, 2019 | Image Captioning, Image Retrieval
Neural Extractive Summarization with Side Information | Apr 14, 2017 | Articles, Document Summarization
HalLoc: Token-level Localization of Hallucinations for Vision Language Models | Jun 12, 2025 | Hallucination, Image Captioning
DeepDiary: Automatic Caption Generation for Lifelogging Image Streams | Aug 12, 2016 | Caption Generation, Image Captioning
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions | Nov 26, 2018 | controllable image captioning, Diversity
Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data | Nov 17, 2015 | Image Captioning, Novel Concepts
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN) | Dec 20, 2014 | 8k, Image Captioning
Adaptively Aligned Image Captioning via Adaptive Attention Time | Sep 19, 2019 | Decoder, Image Captioning