WuDaoMM: A large-scale Multi-Modal Dataset for Pre-training models Mar 22, 2022 Image Captioning Image Generation
XGPT: Cross-modal Generative Pre-Training for Image Captioning Mar 3, 2020 Data Augmentation Denoising
Zero-Resource Neural Machine Translation with Multi-Agent Communication Game Feb 9, 2018 Decoder Image Captioning
Zero-Shot, But at What Cost? Unveiling the Hidden Overhead of MILS's LLM-CLIP Framework for Image Captioning Apr 21, 2025 Image Captioning
Zero-shot Image Captioning by Anchor-augmented Vision-Language Space Alignment Nov 14, 2022 Computational Efficiency Image Captioning
Ziya-Visual: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning Oct 12, 2023 Image Captioning Image-text Retrieval
0/1 Deep Neural Networks via Block Coordinate Descent Jun 19, 2022 10-shot image generation
Learning to Disambiguate by Asking Discriminative Questions Aug 9, 2017 Benchmarking Image Captioning
Learning to generalize to new compositions in image understanding Aug 27, 2016 Image Captioning Structured Prediction
Learning to Guide Decoding for Image Captioning Apr 3, 2018 Attribute Decoder
Learning to Relate from Captions and Bounding Boxes Dec 1, 2019 Image Captioning Relation Classification
Learning to Select: A Fully Attentive Approach for Novel Object Captioning Jun 2, 2021 Image Captioning Language Modeling
Learning Visual-Linguistic Adequacy, Fidelity, and Fluency for Novel Object Captioning Sep 29, 2021 Image Captioning
Learning Visual Representations with Caption Annotations Aug 4, 2020 Image Captioning Language Modeling
Learning Word Embeddings for Low-Resource Languages by PU Learning Jun 1, 2018 Document Ranking Image Captioning
Let's Go Shopping (LGS) -- Web-Scale Image-Text Dataset for Visual Concept Understanding Jan 9, 2024 Image Captioning image-classification
"Let's not Quote out of Context": Unified Vision-Language Pretraining for Context Assisted Image Captioning Jun 1, 2023 Image Captioning Keyword Extraction
Leveraging Partial Dependency Trees to Control Image Captions Jun 1, 2021 Image Captioning
Leveraging Sentence Similarity in Natural Language Generation: Improving Beam Search using Range Voting Aug 17, 2019 Image Captioning Language Modeling
Leveraging Visual Knowledge in Language Tasks: An Empirical Study on Intermediate Pre-training for Cross-Modal Knowledge Transfer Nov 16, 2021 Image Captioning Language Modeling
Leveraging Visual Knowledge in Language Tasks: An Empirical Study on Intermediate Pre-training for Cross-modal Knowledge Transfer Mar 14, 2022 Image Captioning Language Modeling
Lexical Simplification with the Deep Structured Similarity Model Nov 1, 2017 Image Captioning Learning Word Embeddings
LG-VQ: Language-Guided Codebook Learning May 23, 2024 Image Captioning Image Generation
Light as Deception: GPT-driven Natural Relighting Against Vision-Language Pre-training Models May 30, 2025 Image Captioning Question Answering
Lightweight In-Context Tuning for Multimodal Unified Models Oct 8, 2023 Image Captioning In-Context Learning
Linguistically-aware Attention for Reducing the Semantic-Gap in Vision-Language Tasks Aug 18, 2020 Image Captioning Visual Question Answering (VQA)
利用图像描述与知识图谱增强表示的视觉问答(Exploiting Image Captions and External Knowledge as Representation Enhancement for Visual Question Answering) Aug 1, 2021 Image Captioning Question Answering
LLaMA-Excitor: General Instruction Tuning via Indirect Feature Interaction Apr 1, 2024 Image Captioning Instruction Following
LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning Jun 17, 2024 Image Captioning Question Answering
LLM4VG: Large Language Models Evaluation for Video Grounding Dec 21, 2023 Image Captioning Video Grounding
LLMs Can Check Their Own Results to Mitigate Hallucinations in Traffic Understanding Tasks Sep 19, 2024 Autonomous Driving Hallucination
LocCa: Visual Pretraining with Location-aware Captioners Mar 28, 2024 Decoder Image Captioning
Longer Version for "Deep Context-Encoding Network for Retinal Image Captioning" May 30, 2021 Avg Decoder
Long-Tail Classification for Distinctive Image Captioning: A Simple yet Effective Remedy for Side Effects of Reinforcement Learning Jan 16, 2022 Image Captioning Reinforcement Learning (RL)
Look Back and Predict Forward in Image Captioning Jun 1, 2019 Decoder Image Captioning
Look Deeper See Richer: Depth-aware Image Paragraph Captioning Oct 15, 2018 Decoder Image Captioning
LookupViT: Compressing visual information to a limited number of tokens Jul 17, 2024 Image Captioning image-classification
Lost in Translation: When GPT-4V(ision) Can't See Eye to Eye with Text. A Vision-Language-Consistency Analysis of VLLMs and Beyond Oct 19, 2023 Image Captioning Language Modeling
LVLM_CSP: Accelerating Large Vision Language Models via Clustering, Scattering, and Pruning for Reasoning Segmentation Apr 15, 2025 Image Captioning Question Answering
Lyrics: Boosting Fine-grained Language-Vision Alignment and Comprehension via Semantic-aware Visual Objects Dec 8, 2023 Image Captioning object-detection
M3D-GAN: Multi-Modal Multi-Domain Translation with Universal Attention Jul 9, 2019 Dialogue Generation Image Captioning
Macroscopic Control of Text Generation for Image Captioning Jan 20, 2021 Diversity Image Captioning
MAGIC: Multimodal relAtional Graph adversarIal inferenCe for Diverse and Unpaired Text-based Image Captioning Dec 13, 2021 Caption Generation Descriptive
MAGNet: Multi-Region Attention-Assisted Grounding of Natural Language Queries at Phrase Level Jun 6, 2020 Attribute Image Captioning
Making the Most of What You Have: Adapting Pre-trained Visual Language Models in the Low-data Regime May 3, 2023 Image Captioning Question Answering
Making Use of Latent Space in Language GANs for Generating Diverse Text without Pre-training Apr 1, 2021 Diversity Image Captioning
MAMI: Multi-Attentional Mutual-Information for Long Sequence Neuron Captioning Jan 5, 2024 Decoder Image Captioning
Mapping Images to Sentiment Adjective Noun Pairs with Factorized Neural Nets Nov 21, 2015 Image Captioning
Mask-aware Text-to-Image Retrieval: Referring Expression Segmentation Meets Cross-modal Retrieval Jun 28, 2025 Cross-Modal Retrieval Image Captioning
Mask-free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations Mar 29, 2023 Image Captioning Instance Segmentation