DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training Mar 6, 2023 Decoder Image Captioning
Bootstrapping Interactive Image-Text Alignment for Remote Sensing Image Captioning Dec 2, 2023 Causal Language Modeling Contrastive Learning
Boostlet.js: Image processing plugins for the web via JavaScript injection May 13, 2024 Data Visualization Image Captioning
FG-CXR: A Radiologist-Aligned Gaze Dataset for Enhancing Interpretability in Chest X-Ray Report Generation Nov 23, 2024 Anatomy Image Captioning
Length-Controllable Image Captioning Jul 19, 2020 controllable image captioning Decoder
Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory Mar 19, 2024 Adversarial Text Diversity
FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model Jun 10, 2024 Image Captioning
PAINT: Paying Attention to INformed Tokens to Mitigate Hallucination in Large Vision-Language Model Jan 21, 2025 Hallucination Image Captioning
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling Nov 23, 2021 Image Captioning Image Description
RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection May 30, 2024 Image Captioning Image Inpainting
Crisscrossed Captions: Extended Intramodal and Intermodal Semantic Similarity Judgments for MS-COCO Apr 30, 2020 Image Captioning Representation Learning
Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs Mar 1, 2020 Attribute Caption Generation
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers May 27, 2023 Image Captioning Image Retrieval
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features May 13, 2019 Domain Generalization Image Captioning
Comprehensive Image Captioning via Scene Graph Decomposition Jul 23, 2020 Diversity Image Captioning
Let there be a clock on the beach: Reducing Object Hallucination in Image Captioning Oct 4, 2021 Hallucination Image Captioning
Latent Normalizing Flows for Many-to-Many Cross-Domain Mappings Feb 16, 2020 Image Captioning Image Generation
FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context Mar 4, 2022 Decoder Image Captioning
Concadia: Towards Image-Based Text Generation with a Purpose Apr 16, 2021 Image Captioning Image to text
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts Feb 17, 2021 Caption Generation Diversity
Self-critical Sequence Training for Image Captioning Dec 2, 2016 Image Captioning Policy Gradient Methods
Confidence-aware Non-repetitive Multimodal Transformers for TextCaps Dec 7, 2020 Image Captioning Optical Character Recognition
COSMic: A Coherence-Aware Generation Metric for Image Descriptions Sep 11, 2021 Caption Generation Image Captioning
GAIA: A Global, Multi-modal, Multi-scale Vision-Language Dataset for Remote Sensing Image Analysis Feb 13, 2025 Cross-Modal Retrieval Image Captioning
LAVCap: LLM-based Audio-Visual Captioning using Optimal Transport Jan 16, 2025 AudioCaps Audio captioning
Connecting What to Say With Where to Look by Modeling Human Attention Traces May 12, 2021 Caption Generation Image Captioning
Consensus-Aware Visual-Semantic Embedding for Image-Text Matching Jul 17, 2020 Image Captioning Image-text matching
Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs? Jan 5, 2025 Image Captioning Image to text
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing Mar 4, 2023 Diversity Image Captioning
Show, Deconfound and Tell: Image Captioning With Causal Inference Jan 1, 2022 Causal Inference Decoder
Genixer: Empowering Multimodal Large Language Models as a Powerful Data Generator Dec 11, 2023 Image Captioning Question Answering
Sieve: Multimodal Dataset Pruning Using Image Captioning Models Oct 3, 2023 Diversity Image Captioning
Learning Distinct and Representative Styles for Image Captioning Sep 17, 2022 Diversity Image Captioning
ConvNet Architecture Search for Spatiotemporal Feature Learning Aug 16, 2017 Action Classification Action Recognition
Detecting and Recovering Sequential DeepFake Manipulation Jul 5, 2022 DeepFake Detection Face Swapping
Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models May 29, 2023 Image Captioning Image Classification
Convolutional Image Captioning Nov 24, 2017 Image Captioning Text Generation
Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models May 24, 2023 document understanding Image Captioning
End-to-End Transformer Based Model for Image Captioning Mar 29, 2022 Decoder Image Captioning
Graph Optimal Transport for Cross-Domain Alignment Jun 26, 2020 Graph Matching Image Captioning
Learning to Generate Grounded Visual Captions without Localization Supervision Aug 1, 2020 Image Captioning Language Modelling
LIME: Less Is More for MLLM Evaluation Sep 10, 2024 Image Captioning Question Answering
G-VEval: A Versatile Metric for Evaluating Image and Video Captions Using GPT-4o Dec 18, 2024 Image Captioning Video Captioning
How (not) to Train your Generative Model: Scheduled Sampling, Likelihood, Adversary? Nov 16, 2015 Image Captioning
Mining Fine-Grained Image-Text Alignment for Zero-Shot Captioning via Text-Only Training Jan 4, 2024 Descriptive Image Captioning
Hard Non-Monotonic Attention for Character-Level Transduction Aug 29, 2018 Hard Attention Image Captioning
Contrastive Vision-Language Alignment Makes Efficient Instruction Learner Nov 29, 2023 Contrastive Learning Image Captioning
Harnessing the Power of Large Vision Language Models for Synthetic Image Detection Apr 3, 2024 Image Captioning Synthetic Image Detection
Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning Aug 8, 2022 Image Captioning Image Generation
Paying Attention to Descriptions Generated by Image Captioning Models Apr 24, 2017 Image Captioning