Learning to Generate Grounded Visual Captions without Localization Supervision | Jun 1, 2019 | Image Captioning, Language Modelling
Learning Distinct and Representative Styles for Image Captioning | Sep 17, 2022 | Diversity, Image Captioning
Analysis of diversity-accuracy tradeoff in image captioning | Feb 27, 2020 | Diversity, Image Captioning
Length-Controllable Image Captioning | Jul 19, 2020 | controllable image captioning, Decoder
ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning | Oct 23, 2024 | Image Captioning, Instruction Following
LIME: Less Is More for MLLM Evaluation | Sep 10, 2024 | Image Captioning, Question Answering
CoCa: Contrastive Captioners are Image-Text Foundation Models | May 4, 2022 | Action Classification, Decoder
GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features | Jul 20, 2022 | Image Captioning
Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report Generation | Jun 6, 2020 | Decoder, Image Captioning
AVCap: Leveraging Audio-Visual Features as Text Tokens for Captioning | Jul 10, 2024 | Audio-Visual Captioning, Image Captioning
Dense Relational Image Captioning via Multi-task Triple-Stream Networks | Oct 8, 2020 | Graph Generation, Image Captioning
Self-supervised Learning from a Multi-view Perspective | Jun 10, 2020 | Image Captioning, Language Modelling
CLIPTrans: Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation | Aug 29, 2023 | Image Captioning, Machine Translation
Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA | May 13, 2020 | Image Captioning, Multi-Label Classification
Can images help recognize entities? A study of the role of images for Multimodal NER | Oct 23, 2020 | Image Captioning, named-entity-recognition
Describe What to Change: A Text-guided Unsupervised Image-to-Image Translation Approach | Aug 10, 2020 | Attribute, Image Captioning
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA | Sep 10, 2021 | Image Captioning, Question Answering
Differentially Private Representation Learning via Image Captioning | Mar 4, 2024 | Image Captioning, Representation Learning
CLIPScore: A Reference-free Evaluation Metric for Image Captioning | Apr 18, 2021 | Hallucination Pair-wise Detection (1-ref), Hallucination Pair-wise Detection (4-ref)
CNN+CNN: Convolutional Decoders for Image Captioning | May 23, 2018 | Image Captioning, Sentence
Bayesian Attention Modules | Oct 20, 2020 | Image Captioning, Machine Translation
Bayesian Recurrent Neural Networks | Apr 10, 2017 | Image Captioning, Language Modelling
DiffX: Guide Your Layout to Cross-Modal Generative Modeling | Jul 22, 2024 | Denoising, Image Captioning
Belief Revision based Caption Re-ranker with Visual Semantic Information | Sep 16, 2022 | Caption Generation, Image Captioning
Discovering Autoregressive Orderings with Variational Inference | Jan 1, 2021 | Code Generation, Image Captioning
Discovering Non-monotonic Autoregressive Orderings with Variational Inference | Oct 27, 2021 | Decoder, Image Captioning
CLIP-Diffusion-LM: Apply Diffusion Model on Image Captioning | Oct 10, 2022 | Decoder, Denoising
Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning | Dec 11, 2024 | Attribute, Benchmarking
Fooling Contrastive Language-Image Pre-trained Models with CLIPMasterPrints | Jul 7, 2023 | Image Captioning, Image Retrieval
BERTGEN: Multi-task Generation through BERT | Jun 7, 2021 | Decoder, Image Captioning
BERTScore: Evaluating Text Generation with BERT | Apr 21, 2019 | Image Captioning, Machine Translation
COCO-Stuff: Thing and Stuff Classes in Context | Dec 12, 2016 | Image Captioning, Semantic Segmentation
Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models | Oct 7, 2016 | Diversity, Image Captioning
Diverse Image Captioning with Context-Object Split Latent Spaces | Nov 2, 2020 | Diversity, Image Captioning
G-VEval: A Versatile Metric for Evaluating Image and Video Captions Using GPT-4o | Dec 18, 2024 | Image Captioning, Video Captioning
Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning | Dec 15, 2023 | Factual Inconsistency Detection in Chart Captioning, Image Captioning
Image Captioning In the Transformer Age | Apr 15, 2022 | Decoder, Image Captioning
Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model | Aug 2, 2023 | Hallucination, Image Captioning
Beyond Greedy Search: Tracking by Multi-Agent Reinforcement Learning-based Beam Search | May 19, 2022 | Decision Making, Image Captioning
Aesthetically Relevant Image Captioning | Nov 25, 2022 | Image Captioning, Sentence
Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner | May 19, 2023 | Dense Captioning, Image Captioning
MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition | Jul 27, 2016 | Face Recognition, Image Captioning
Instruction-guided Multi-Granularity Segmentation and Captioning with Large Multimodal Model | Sep 20, 2024 | Image Captioning, Panoptic Segmentation
Language Guided Visual Question Answering: Elevate Your Multimodal Language Model Using Knowledge-Enriched Prompts | Oct 31, 2023 | Image Captioning, Language Modeling
EDSL: An Encoder-Decoder Architecture with Symbol-Level Features for Printed Mathematical Expression Recognition | Jul 6, 2020 | Decoder, Image Captioning
Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand | Dec 8, 2021 | Image Captioning, Machine Translation
Egoshots, an ego-vision life-logging dataset and semantic fidelity metric to evaluate diversity in image captioning models | Mar 26, 2020 | Diversity, Image Captioning
Genixer: Empowering Multimodal Large Language Models as a Powerful Data Generator | Dec 11, 2023 | Image Captioning, Question Answering
Myriad: Large Multimodal Model by Applying Vision Experts for Industrial Anomaly Detection | Oct 29, 2023 | Anomaly Detection, Image Captioning
A large annotated corpus for learning natural language inference | Aug 21, 2015 | Image Captioning, Natural Language Inference