PAINT: Paying Attention to INformed Tokens to Mitigate Hallucination in Large Vision-Language Model Jan 21, 2025 Hallucination Image Captioning
Code Code Available 1Bi-LORA: A Vision-Language Approach for Synthetic Image Detection Apr 2, 2024 Binary Classification Image Captioning
Code Code Available 1RSGPT: A Remote Sensing Vision Language Model and Benchmark Jul 28, 2023 Image Captioning Language Modeling
Code Code Available 1FG-CXR: A Radiologist-Aligned Gaze Dataset for Enhancing Interpretability in Chest X-Ray Report Generation Nov 23, 2024 Anatomy Image Captioning
Code Code Available 1FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model Jun 10, 2024 Image Captioning
Code Code Available 1FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks Mar 4, 2023 Cross-Modal Retrieval Image Captioning
Code Code Available 1Can We Talk Models Into Seeing the World Differently? Mar 14, 2024 Image Captioning Image Classification
Code Code Available 1Chart-to-Text: A Large-Scale Benchmark for Chart Summarization Mar 12, 2022 Data-to-Text Generation Image Captioning
Code Code Available 1Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards Aug 6, 2020 Attribute Image Captioning
Code Code Available 1Exploring Diverse In-Context Configurations for Image Captioning May 24, 2023 Image Captioning In-Context Learning
Code Code Available 1Exchanging-based Multimodal Fusion with Transformer Sep 5, 2023 Image Captioning Image Generation
Code Code Available 1Expressive Scene Graph Generation Using Commonsense Knowledge Infusion for Visual Understanding and Reasoning May 31, 2022 Common Sense Reasoning Graph Generation
Code Code Available 1CgT-GAN: CLIP-guided Text GAN for Image Captioning Aug 23, 2023 Image Captioning
Code Code Available 1CaMEL: Mean Teacher Learning for Image Captioning Feb 21, 2022 Image Captioning Knowledge Distillation
Code Code Available 1ChatEarthNet: A Global-Scale Image-Text Dataset Empowering Vision-Language Geo-Foundation Models Feb 17, 2024 Earth Observation Image Captioning
Code Code Available 1Evolving Deep Neural Networks Mar 1, 2017 Deep Learning Image Captioning
Code Code Available 1FACTUAL: A Benchmark for Faithful and Consistent Textual Scene Graph Parsing May 27, 2023 Graph Similarity Human Judgment Correlation
Code Code Available 1From Association to Generation: Text-only Captioning by Unsupervised Cross-modal Mapping Apr 26, 2023 Decoder Image Captioning
Code Code Available 1Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner May 19, 2023 Dense Captioning Image Captioning
Code Code Available 1EDSL: An Encoder-Decoder Architecture with Symbol-Level Features for Printed Mathematical Expression Recognition Jul 6, 2020 Decoder Image Captioning
Code Code Available 1Enhancing Visual Question Answering through Question-Driven Image Captions as Prompts Apr 12, 2024 Image Captioning Question Answering
Code Code Available 1Dual-branch Hybrid Learning Network for Unbiased Scene Graph Generation Jul 16, 2022 Graph Generation Image Captioning
Code Code Available 1Distinctive Image Captioning: Leveraging Ground Truth Captions in CLIP Guided Reinforcement Learning Feb 21, 2024 Cross-Modal Retrieval Image Captioning
Code Code Available 1Diverse Image Captioning with Context-Object Split Latent Spaces Nov 2, 2020 Diversity Image Captioning
Code Code Available 1Dual-Level Collaborative Transformer for Image Captioning Jan 16, 2021 Descriptive Image Captioning
Code Code Available 1ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation Dec 31, 2021 Image Captioning Image Generation
Code Code Available 1Bridging the Domain Gap: Self-Supervised 3D Scene Understanding with Foundation Models May 15, 2023 3D Object Detection Image Captioning
Code Code Available 1Discovering Autoregressive Orderings with Variational Inference Jan 1, 2021 Code Generation Image Captioning
Code Code Available 1Discovering Non-monotonic Autoregressive Orderings with Variational Inference Oct 27, 2021 Decoder Image Captioning
Code Code Available 1Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models Oct 7, 2016 Diversity Image Captioning
Code Code Available 1Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning Dec 15, 2023 Factual Inconsistency Detection in Chart Captioning Image Captioning
Code Code Available 1Diffusion Bridge: Leveraging Diffusion Model to Reduce the Modality Gap Between Text and Vision for Zero-Shot Image Captioning Jan 1, 2025 cross-modal alignment Denoising
Code Code Available 1A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models Oct 16, 2021 Image Captioning Language Modeling
Code Code Available 1Egoshots, an ego-vision life-logging dataset and semantic fidelity metric to evaluate diversity in image captioning models Mar 26, 2020 Diversity Image Captioning
Code Code Available 1End-to-End Supermask Pruning: Learning to Prune Image Captioning Models Oct 7, 2021 Decoder Image Captioning
Code Code Available 1End-to-End Transformer Based Model for Image Captioning Mar 29, 2022 Decoder Image Captioning
Code Code Available 1DiffX: Guide Your Layout to Cross-Modal Generative Modeling Jul 22, 2024 Denoising Image Captioning
Code Code Available 1Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift Dec 15, 2022 Benchmarking Image Captioning
Code Code Available 1ACORT: A Compact Object Relation Transformer for Parameter Efficient Image Captioning Feb 11, 2022 Image Captioning Relation
Code Code Available 1Can Audio Captions Be Evaluated with Image Caption Metrics? Oct 10, 2021 AudioCaps Audio captioning
Code Code Available 1DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval Jun 10, 2025 Image Captioning Retrieval
Code Code Available 1Are scene graphs good enough to improve Image Captioning? Sep 25, 2020 Decoder Graph Attention
Code Code Available 1Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning Aug 13, 2022 Image Captioning
Code Code Available 1Exploring Discrete Diffusion Models for Image Captioning Nov 21, 2022 Image Captioning Image Generation
Code Code Available 1BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues Jul 29, 2024 Image Captioning
Code Code Available 1Detecting and Recovering Sequential DeepFake Manipulation Jul 5, 2022 DeepFake Detection Face Swapping
Code Code Available 1CAPIVARA: Cost-Efficient Approach for Improving Multilingual CLIP Performance on Low-Resource Languages Oct 20, 2023 Diversity GPU
Code Code Available 1Detecting Hate Speech in Multi-modal Memes Dec 29, 2020 Binary Classification Hate Speech Detection
Code Code Available 1Brain Captioning: Decoding human brain activity into images and text May 19, 2023 Brain Decoding Depth Estimation
Code Code Available 1Dense Relational Image Captioning via Multi-task Triple-Stream Networks Oct 8, 2020 Graph Generation Image Captioning
Code Code Available 1