Dual-Level Collaborative Transformer for Image Captioning Jan 16, 2021 Descriptive Image Captioning
Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs? Jan 5, 2025 Image Captioning Image to text
A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models Oct 16, 2021 Image Captioning Language Modeling
EDSL: An Encoder-Decoder Architecture with Symbol-Level Features for Printed Mathematical Expression Recognition Jul 6, 2020 Decoder Image Captioning
End-to-End Supermask Pruning: Learning to Prune Image Captioning Models Oct 7, 2021 Decoder Image Captioning
GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features Jul 20, 2022 Image Captioning
Hard Non-Monotonic Attention for Character-Level Transduction Aug 29, 2018 Hard Attention Image Captioning
Harnessing the Power of Large Vision Language Models for Synthetic Image Detection Apr 3, 2024 Image Captioning Synthetic Image Detection
Human-like Controllable Image Captioning with Verb-specific Semantic Roles Mar 22, 2021 Caption Generation controllable image captioning
IC3: Image Captioning by Committee Consensus Feb 2, 2023 Image Captioning
Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives Mar 18, 2025 Image Captioning
Image Captioning In the Transformer Age Apr 15, 2022 Decoder Image Captioning
Image Captioning with Sparse Recurrent Neural Network Aug 28, 2019 Image Captioning Text Generation
CgT-GAN: CLIP-guided Text GAN for Image Captioning Aug 23, 2023 Image Captioning
A large annotated corpus for learning natural language inference Aug 21, 2015 Image Captioning Natural Language Inference
Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network Dec 13, 2020 Caption Generation Decoder
In Defense of Grid Features for Visual Question Answering Jan 10, 2020 Image Captioning Question Answering
InfMLLM: A Unified Framework for Visual-Language Tasks Nov 12, 2023 GPU Image Captioning
BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues Jul 29, 2024 Image Captioning
Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning Dec 15, 2023 Factual Inconsistency Detection in Chart Captioning Image Captioning
ChatEarthNet: A Global-Scale Image-Text Dataset Empowering Vision-Language Geo-Foundation Models Feb 17, 2024 Earth Observation Image Captioning
Brain Captioning: Decoding human brain activity into images and text May 19, 2023 Brain Decoding Depth Estimation
Compact Bidirectional Transformer for Image Captioning Jan 6, 2022 Decoder Image Captioning
Language Guided Visual Question Answering: Elevate Your Multimodal Language Model Using Knowledge-Enriched Prompts Oct 31, 2023 Image Captioning Language Modeling
Large-Scale Bidirectional Training for Zero-Shot Image Captioning Nov 13, 2022 Image Captioning Keyword Extraction
Latent Normalizing Flows for Many-to-Many Cross-Domain Mappings Feb 16, 2020 Image Captioning Image Generation
Dual-branch Hybrid Learning Network for Unbiased Scene Graph Generation Jul 16, 2022 Graph Generation Image Captioning
End-to-End Transformer Based Model for Image Captioning Mar 29, 2022 Decoder Image Captioning
FG-CXR: A Radiologist-Aligned Gaze Dataset for Enhancing Interpretability in Chest X-Ray Report Generation Nov 23, 2024 Anatomy Image Captioning
Length-Controllable Image Captioning Jul 19, 2020 controllable image captioning Decoder
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering Jul 25, 2017 Image Captioning Visual Question Answering
Linearly Mapping from Image to Text Space Sep 30, 2022 Image Captioning Image to text
A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions Dec 14, 2023 Image Captioning
Disentangled Pre-training for Human-Object Interaction Detection Apr 2, 2024 Action Recognition Decoder
Bootstrapping Interactive Image-Text Alignment for Remote Sensing Image Captioning Dec 2, 2023 Causal Language Modeling Contrastive Learning
Boostlet.js: Image processing plugins for the web via JavaScript injection May 13, 2024 Data Visualization Image Captioning
DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval Jun 10, 2025 Image Captioning Retrieval
CLIPTrans: Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation Aug 29, 2023 Image Captioning Machine Translation
M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training Jun 4, 2020 Image Captioning Image Retrieval
CNN+CNN: Convolutional Decoders for Image Captioning May 23, 2018 Image Captioning Sentence
Distinctive Image Captioning: Leveraging Ground Truth Captions in CLIP Guided Reinforcement Learning Feb 21, 2024 Cross-Modal Retrieval Image Captioning
Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone Jun 15, 2022 Described Object Detection Image Captioning
Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory Mar 19, 2024 Adversarial Text Diversity
Contrastive Vision-Language Alignment Makes Efficient Instruction Learner Nov 29, 2023 Contrastive Learning Image Captioning
CoCa: Contrastive Captioners are Image-Text Foundation Models May 4, 2022 Action Classification Decoder
MedMax: Mixed-Modal Instruction Tuning for Training Biomedical Assistants Dec 17, 2024 Image Captioning Question Answering
MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models Jun 17, 2024 Benchmarking Fact Checking
COCO-Stuff: Thing and Stuff Classes in Context Dec 12, 2016 Image Captioning Semantic Segmentation
Mining Fine-Grained Image-Text Alignment for Zero-Shot Captioning via Text-Only Training Jan 4, 2024 Descriptive Image Captioning
DiffX: Guide Your Layout to Cross-Modal Generative Modeling Jul 22, 2024 Denoising Image Captioning