Learning to Generate Grounded Visual Captions without Localization Supervision Jun 1, 2019 Image Captioning Language Modelling
Lever LM: Configuring In-Context Sequence to Lever Large Vision Language Models Dec 15, 2023 Image Captioning In-Context Learning
Analysis of diversity-accuracy tradeoff in image captioning Feb 27, 2020 Diversity Image Captioning
Image Captioning In the Transformer Age Apr 15, 2022 Decoder Image Captioning
ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning Oct 23, 2024 Image Captioning Instruction Following
Image Captioning with Sparse Recurrent Neural Network Aug 28, 2019 Image Captioning Text Generation
Automatic Text Evaluation through the Lens of Wasserstein Barycenters Aug 27, 2021 Image Captioning Machine Translation
Diverse Image Captioning with Context-Object Split Latent Spaces Nov 2, 2020 Diversity Image Captioning
Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report Generation Jun 6, 2020 Decoder Image Captioning
AVCap: Leveraging Audio-Visual Features as Text Tokens for Captioning Jul 10, 2024 Audio-Visual Captioning Image Captioning
Injecting Semantic Concepts into End-to-End Image Captioning Dec 9, 2021 Caption Generation Image Captioning
Investigating Prompting Techniques for Zero- and Few-Shot Visual Question Answering Jun 16, 2023 Image Captioning Question Answering
Diffusion Bridge: Leveraging Diffusion Model to Reduce the Modality Gap Between Text and Vision for Zero-Shot Image Captioning Jan 1, 2025 cross-modal alignment Denoising
Differentially Private Representation Learning via Image Captioning Mar 4, 2024 Image Captioning Representation Learning
DiffX: Guide Your Layout to Cross-Modal Generative Modeling Jul 22, 2024 Denoising Image Captioning
Language Guided Visual Question Answering: Elevate Your Multimodal Language Model Using Knowledge-Enriched Prompts Oct 31, 2023 Image Captioning Language Modeling
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA Sep 10, 2021 Image Captioning Question Answering
Large-Scale Bidirectional Training for Zero-Shot Image Captioning Nov 13, 2022 Image Captioning Keyword Extraction
Can Audio Captions Be Evaluated with Image Caption Metrics? Oct 10, 2021 AudioCaps Audio captioning
Learning to Generate Grounded Visual Captions without Localization Supervision Aug 1, 2020 Image Captioning Language Modelling
Bayesian Attention Modules Oct 20, 2020 Image Captioning Machine Translation
Bayesian Recurrent Neural Networks Apr 10, 2017 Image Captioning Language Modelling
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering Jul 25, 2017 Image Captioning Visual Question Answering
Detecting Hate Speech in Multi-modal Memes Dec 29, 2020 Binary Classification Hate Speech Detection
Discovering Autoregressive Orderings with Variational Inference Jan 1, 2021 Code Generation Image Captioning
Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning Dec 15, 2023 Factual Inconsistency Detection in Chart Captioning Image Captioning
Genixer: Empowering Multimodal Large Language Models as a Powerful Data Generator Dec 11, 2023 Image Captioning Question Answering
Self-supervised Learning from a Multi-view Perspective Jun 10, 2020 Image Captioning Language Modelling
A large annotated corpus for learning natural language inference Aug 21, 2015 Image Captioning Natural Language Inference
BERTGEN: Multi-task Generation through BERT Jun 7, 2021 Decoder Image Captioning
BERTScore: Evaluating Text Generation with BERT Apr 21, 2019 Image Captioning Machine Translation
MAGVLT: Masked Generative Vision-and-Language Transformer Mar 21, 2023 Image Captioning Image Generation
Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA May 13, 2020 Image Captioning Multi-Label Classification
DeeCap: Dynamic Early Exiting for Efficient Image Captioning Jan 1, 2022 Image Captioning Imitation Learning
Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning May 9, 2022 Image Captioning Object
MemeCap: A Dataset for Captioning and Interpreting Memes May 23, 2023 Image Captioning Meme Captioning
Beyond Generation: Harnessing Text to Image Models for Object Detection and Segmentation Sep 12, 2023 Image Captioning Image Generation
Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model Aug 2, 2023 Hallucination Image Captioning
Beyond Greedy Search: Tracking by Multi-Agent Reinforcement Learning-based Beam Search May 19, 2022 Decision Making Image Captioning
Aesthetically Relevant Image Captioning Nov 25, 2022 Image Captioning Sentence
DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training Mar 6, 2023 Decoder Image Captioning
DeltaNet:Conditional Medical Report Generation for COVID-19 Diagnosis Nov 12, 2022 COVID-19 Diagnosis Decoder
Instruction-guided Multi-Granularity Segmentation and Captioning with Large Multimodal Model Sep 20, 2024 Image Captioning Panoptic Segmentation
Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning Mar 14, 2019 Diversity Image Captioning
MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition Jul 27, 2016 Face Recognition Image Captioning
Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand Dec 8, 2021 Image Captioning Machine Translation
CgT-GAN: CLIP-guided Text GAN for Image Captioning Aug 23, 2023 Image Captioning
Multimodal Image-Text Matching Improves Retrieval-based Chest X-Ray Report Generation Mar 29, 2023 Image Captioning Image-text matching
Multiple Instance Captioning: Learning Representations from Histopathology Textbooks and Articles Mar 8, 2021 Articles Diagnostic
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling Nov 23, 2021 Image Captioning Image Description