Learning to Generate Grounded Visual Captions without Localization Supervision Aug 1, 2020 Image Captioning Language Modelling
Bi-LORA: A Vision-Language Approach for Synthetic Image Detection Apr 2, 2024 Binary Classification Image Captioning
RSGPT: A Remote Sensing Vision Language Model and Benchmark Jul 28, 2023 Image Captioning Language Modeling
ConTEXTual Net: A Multimodal Vision-Language Model for Segmentation of Pneumothorax Mar 2, 2023 Descriptive Image Captioning
It is Okay to Not Be Okay: Overcoming Emotional Bias in Affective Image Captioning by Contrastive Data Collection Apr 15, 2022 Image Captioning
Injecting Semantic Concepts into End-to-End Image Captioning Dec 9, 2021 Caption Generation Image Captioning
InfMLLM: A Unified Framework for Visual-Language Tasks Nov 12, 2023 GPU Image Captioning
Investigating Prompting Techniques for Zero- and Few-Shot Visual Question Answering Jun 16, 2023 Image Captioning Question Answering
Kosmos-2: Grounding Multimodal Large Language Models to the World Jun 26, 2023 Image Captioning In-Context Learning
Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network Dec 13, 2020 Caption Generation Decoder
Consensus-Aware Visual-Semantic Embedding for Image-Text Matching Jul 17, 2020 Image Captioning Image-text matching
Improving Image Captioning with Better Use of Captions Jun 21, 2020 Caption Generation Image Captioning
Image Captioning with Sparse Recurrent Neural Network Aug 28, 2019 Image Captioning Text Generation
CaMEL: Mean Teacher Learning for Image Captioning Feb 21, 2022 Image Captioning Knowledge Distillation
Image Captioning through Image Transformer Apr 29, 2020 Image Captioning object-detection
Image Captions are Natural Prompts for Text-to-Image Models Jul 17, 2023 Image Captioning Image Generation
In Defense of Grid Features for Visual Question Answering Jan 10, 2020 Image Captioning Question Answering
LaB-RAG: Label Boosted Retrieval Augmented Generation for Radiology Report Generation Nov 25, 2024 Image Captioning RAG
I Can't Believe There's No Images! Learning Visual Tasks Using only Language Supervision Nov 17, 2022 Image Captioning Question Answering
IC3: Image Captioning by Committee Consensus Feb 2, 2023 Image Captioning
Lever LM: Configuring In-Context Sequence to Lever Large Vision Language Models Dec 15, 2023 Image Captioning In-Context Learning
Harnessing the Power of Large Vision Language Models for Synthetic Image Detection Apr 3, 2024 Image Captioning Synthetic Image Detection
Compact Bidirectional Transformer for Image Captioning Jan 6, 2022 Decoder Image Captioning
How (not) to Train your Generative Model: Scheduled Sampling, Likelihood, Adversary? Nov 16, 2015 Image Captioning
IFCap: Image-like Retrieval and Frequency-based Entity Filtering for Zero-shot Captioning Sep 26, 2024 Image Captioning Retrieval
Bridging the Domain Gap: Self-Supervised 3D Scene Understanding with Foundation Models May 15, 2023 3D Object Detection Image Captioning
CoCa: Contrastive Captioners are Image-Text Foundation Models May 4, 2022 Action Classification Decoder
Graph Optimal Transport for Cross-Domain Alignment Jun 26, 2020 Graph Matching Image Captioning
COBRA: Contrastive Bi-Modal Representation Algorithm May 7, 2020 Cross-Modal Retrieval Image Captioning
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts Feb 17, 2021 Caption Generation Diversity
Hard Non-Monotonic Attention for Character-Level Transduction Aug 29, 2018 Hard Attention Image Captioning
A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models Oct 16, 2021 Image Captioning Language Modeling
Concadia: Towards Image-Based Text Generation with a Purpose Apr 16, 2021 Image Captioning Image to text
Human-like Controllable Image Captioning with Verb-specific Semantic Roles Mar 22, 2021 Caption Generation controllable image captioning
Comprehensive Image Captioning via Scene Graph Decomposition Jul 23, 2020 Diversity Image Captioning
COCO-Stuff: Thing and Stuff Classes in Context Dec 12, 2016 Image Captioning Semantic Segmentation
GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features Jul 20, 2022 Image Captioning
Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift Dec 15, 2022 Benchmarking Image Captioning
ACORT: A Compact Object Relation Transformer for Parameter Efficient Image Captioning Feb 11, 2022 Image Captioning Relation
Can Audio Captions Be Evaluated with Image Caption Metrics? Oct 10, 2021 AudioCaps Audio captioning
BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues Jul 29, 2024 Image Captioning
Are scene graphs good enough to improve Image Captioning? Sep 25, 2020 Decoder Graph Attention
Connecting What to Say With Where to Look by Modeling Human Attention Traces May 12, 2021 Caption Generation Image Captioning
ImageNet3D: Towards General-Purpose Object-Level 3D Understanding Jun 13, 2024 Image Captioning Linear Probing Object-Level 3D Awareness
Can We Talk Models Into Seeing the World Differently? Mar 14, 2024 Image Captioning Image Classification
Brain Captioning: Decoding human brain activity into images and text May 19, 2023 Brain Decoding Depth Estimation
CAPIVARA: Cost-Efficient Approach for Improving Multilingual CLIP Performance on Low-Resource Languages Oct 20, 2023 Diversity GPU
InfoMetIC: An Informative Metric for Reference-free Image Caption Evaluation May 10, 2023 Benchmarking Image Captioning
Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone Jun 15, 2022 Described Object Detection Image Captioning
GIT-Mol: A Multi-modal Large Language Model for Molecular Science with Graph, Image, and Text Aug 14, 2023 Drug Discovery Image Captioning