M^4I: Multi-modal Models Membership Inference | Sep 15, 2022 | Image Captioning, Inference Attack
MAGVLT: Masked Generative Vision-and-Language Transformer | Mar 21, 2023 | Image Captioning, Image Generation
Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA | May 13, 2020 | Image Captioning, Multi-Label Classification
Disentangled Pre-training for Human-Object Interaction Detection | Apr 2, 2024 | Action Recognition, Decoder
DeltaNet: Conditional Medical Report Generation for COVID-19 Diagnosis | Nov 12, 2022 | COVID-19 Diagnosis, Decoder
Self-supervised Learning from a Multi-view Perspective | Jun 10, 2020 | Image Captioning, Language Modelling
Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning | Mar 14, 2019 | Diversity, Image Captioning
Dense Relational Image Captioning via Multi-task Triple-Stream Networks | Oct 8, 2020 | Graph Generation, Image Captioning
Describe What to Change: A Text-guided Unsupervised Image-to-Image Translation Approach | Aug 10, 2020 | Attribute, Image Captioning
Mining Fine-Grained Image-Text Alignment for Zero-Shot Captioning via Text-Only Training | Jan 4, 2024 | Descriptive, Image Captioning
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features | May 13, 2019 | Domain Generalization, Image Captioning
Detecting and Recovering Sequential DeepFake Manipulation | Jul 5, 2022 | DeepFake Detection, Face Swapping
Brain Captioning: Decoding human brain activity into images and text | May 19, 2023 | Brain Decoding, Depth Estimation
CgT-GAN: CLIP-guided Text GAN for Image Captioning | Aug 23, 2023 | Image Captioning
A large annotated corpus for learning natural language inference | Aug 21, 2015 | Image Captioning, Natural Language Inference
DiffX: Guide Your Layout to Cross-Modal Generative Modeling | Jul 22, 2024 | Denoising, Image Captioning
Discovering Non-monotonic Autoregressive Orderings with Variational Inference | Oct 27, 2021 | Decoder, Image Captioning
Discovering Autoregressive Orderings with Variational Inference | Jan 1, 2021 | Code Generation, Image Captioning
ConvNet Architecture Search for Spatiotemporal Feature Learning | Aug 16, 2017 | Action Classification, Action Recognition
Investigating Prompting Techniques for Zero- and Few-Shot Visual Question Answering | Jun 16, 2023 | Image Captioning, Question Answering
ChatEarthNet: A Global-Scale Image-Text Dataset Empowering Vision-Language Geo-Foundation Models | Feb 17, 2024 | Earth Observation, Image Captioning
Kosmos-2: Grounding Multimodal Large Language Models to the World | Jun 26, 2023 | Image Captioning, In-Context Learning
Learning Distinct and Representative Styles for Image Captioning | Sep 17, 2022 | Diversity, Image Captioning
Egoshots, an ego-vision life-logging dataset and semantic fidelity metric to evaluate diversity in image captioning models | Mar 26, 2020 | Diversity, Image Captioning
Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning | Dec 15, 2023 | Factual Inconsistency Detection in Chart Captioning, Image Captioning
FACTUAL: A Benchmark for Faithful and Consistent Textual Scene Graph Parsing | May 27, 2023 | Graph Similarity, Human Judgment Correlation
CIDEr: Consensus-based Image Description Evaluation | Nov 20, 2014 | Action Recognition, Attribute
Adapting Grad-CAM for Embedding Networks | Jan 17, 2020 | Image Captioning, image-classification
Dual-Level Collaborative Transformer for Image Captioning | Jan 16, 2021 | Descriptive, Image Captioning
Multiple Instance Captioning: Learning Representations from Histopathology Textbooks and Articles | Mar 8, 2021 | Articles, Diagnostic
Nearest Neighbor Normalization Improves Multimodal Retrieval | Oct 31, 2024 | Cross-Modal Retrieval, Image Captioning
Neighborhood Contrastive Transformer for Change Captioning | Mar 6, 2023 | Decoder, Image Captioning
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering | Jul 25, 2017 | Image Captioning, Visual Question Answering
CLIP-Diffusion-LM: Apply Diffusion Model on Image Captioning | Oct 10, 2022 | Decoder, Denoising
A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions | Dec 14, 2023 | Image Captioning
In Defense of Grid Features for Visual Question Answering | Jan 10, 2020 | Image Captioning, Question Answering
Bootstrapping Interactive Image-Text Alignment for Remote Sensing Image Captioning | Dec 2, 2023 | Causal Language Modeling, Contrastive Learning
Boostlet.js: Image processing plugins for the web via JavaScript injection | May 13, 2024 | Data Visualization, Image Captioning
On Realization of Intelligent Decision-Making in the Real World: A Foundation Decision Model Perspective | Dec 24, 2022 | Decision Making, Image Captioning
CNN+CNN: Convolutional Decoders for Image Captioning | May 23, 2018 | Image Captioning, Sentence
Consensus-Aware Visual-Semantic Embedding for Image-Text Matching | Jul 17, 2020 | Image Captioning, Image-text matching
InfMLLM: A Unified Framework for Visual-Language Tasks | Nov 12, 2023 | GPU, Image Captioning
Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory | Mar 19, 2024 | Adversarial Text, Diversity
End-to-End Transformer Based Model for Image Captioning | Mar 29, 2022 | Decoder, Image Captioning
CoCa: Contrastive Captioners are Image-Text Foundation Models | May 4, 2022 | Action Classification, Decoder
End-to-End Supermask Pruning: Learning to Prune Image Captioning Models | Oct 7, 2021 | Decoder, Image Captioning
Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner | May 19, 2023 | Dense Captioning, Image Captioning
COCO-Stuff: Thing and Stuff Classes in Context | Dec 12, 2016 | Image Captioning, Semantic Segmentation
Enhancing Visual Question Answering through Question-Driven Image Captions as Prompts | Apr 12, 2024 | Image Captioning, Question Answering
Confidence-aware Non-repetitive Multimodal Transformers for TextCaps | Dec 7, 2020 | Image Captioning, Optical Character Recognition