ILLUME: Rationalizing Vision-Language Models through Human Interactions | Aug 17, 2022 | Image Captioning, Question Answering | Code Available | 0
Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning | Aug 13, 2022 | Image Captioning | Code Available | 1
Aesthetic Attributes Assessment of Images with AMANv2 and DPC-CaptionsV2 | Aug 9, 2022 | Attribute, Image Captioning | Unverified | 0
Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning | Aug 8, 2022 | Image Captioning, Image Generation | Code Available | 1
Distinctive Image Captioning via CLIP Guided Group Optimization | Aug 8, 2022 | Image Captioning | Unverified | 0
RadTex: Learning Efficient Radiograph Representations from Text Reports | Aug 5, 2022 | Classification, Decoder | Unverified | 0
Prompt Tuning for Generative Multimodal Pretrained Models | Aug 4, 2022 | Image Captioning, Visual Entailment | Unverified | 0
Neuro-Symbolic Learning: Principles and Applications in Ophthalmology | Jul 31, 2022 | Common Sense Reasoning, Image Captioning | Unverified | 0
Retrieval-Augmented Transformer for Image Captioning | Jul 26, 2022 | Image Captioning, Retrieval | Unverified | 0
Zero-Shot Video Captioning with Evolving Pseudo-Tokens | Jul 22, 2022 | Image Captioning, Image-text matching | Code Available | 1
Rethinking the Reference-based Distinctive Image Captioning | Jul 22, 2022 | Attribute, Benchmarking | Code Available | 0
Efficient Modeling of Future Context for Image Captioning | Jul 22, 2022 | Image Captioning, Sentence | Code Available | 0
GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features | Jul 20, 2022 | Image Captioning | Code Available | 1
Dual-branch Hybrid Learning Network for Unbiased Scene Graph Generation | Jul 16, 2022 | Graph Generation, Image Captioning | Code Available | 1
LineCap: Line Charts for Data Visualization Captioning Models | Jul 15, 2022 | Data Visualization, Deep Learning | Code Available | 0
A Baseline for Detecting Out-of-Distribution Examples in Image Captioning | Jul 12, 2022 | Image Captioning, Out of Distribution (OOD) Detection | Unverified | 0
Adaptive Fine-Grained Predicates Learning for Scene Graph Generation | Jul 11, 2022 | Fine-Grained Image Classification, Graph Generation | Unverified | 0
Predicting Word Learning in Children from the Performance of Computer Vision Systems | Jul 7, 2022 | Image Captioning | Unverified | 0
Exploring the sequence length bottleneck in the Transformer for Image Captioning | Jul 7, 2022 | Image Captioning | Code Available | 0
Detecting and Recovering Sequential DeepFake Manipulation | Jul 5, 2022 | DeepFake Detection, Face Swapping | Code Available | 1
Are metrics measuring what they should? An evaluation of image captioning task metrics | Jul 4, 2022 | Image Captioning | Unverified | 0
MilaNLP at SemEval-2022 Task 5: Using Perceiver IO for Detecting Misogynous Memes with Text and Image Modalities | Jul 1, 2022 | Image Captioning | Code Available | 0
American == White in Multimodal Language-and-Image AI | Jul 1, 2022 | Image Captioning, Question Answering | Unverified | 0
ZoDIAC: Zoneout Dropout Injection Attention Calculation | Jun 28, 2022 | Image Captioning, image-classification | Code Available | 0
Competence-based Multimodal Curriculum Learning for Medical Report Generation | Jun 24, 2022 | Image Captioning, Medical Report Generation | Unverified | 0
DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection | Jun 20, 2022 | Image Captioning, Image Generation | Unverified | 0
What is Where by Looking: Weakly-Supervised Open-World Phrase-Grounding without Text Inputs | Jun 19, 2022 | Benchmarking, Image Captioning | Code Available | 1
A Self-Guided Framework for Radiology Report Generation | Jun 19, 2022 | Image Captioning, Medical Report Generation | Unverified | 0
0/1 Deep Neural Networks via Block Coordinate Descent | Jun 19, 2022 | 10-shot image generation | Unverified | 0
Image Captioning based on Feature Refinement and Reflective Decoding | Jun 16, 2022 | Decoder, Image Captioning | Unverified | 0
A Unified Sequence Interface for Vision Tasks | Jun 15, 2022 | Image Captioning, Instance Segmentation | Unverified | 0
Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone | Jun 15, 2022 | Described Object Detection, Image Captioning | Code Available | 1
Measuring Representational Harms in Image Captioning | Jun 14, 2022 | Fairness, Image Captioning | Unverified | 0
Comprehending and Ordering Semantics for Image Captioning | Jun 14, 2022 | Cross-Modal Retrieval, Image Captioning | Code Available | 2
Language Models are General-Purpose Interfaces | Jun 13, 2022 | Causal Language Modeling, Few-Shot Learning | Unverified | 0
GLIPv2: Unifying Localization and Vision-Language Understanding | Jun 12, 2022 | 2D Object Detection, Contrastive Learning | Code Available | 4
Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs | Jun 9, 2022 | Image Captioning, Image Classification | Code Available | 2
Intra-agent speech permits zero-shot task acquisition | Jun 7, 2022 | Image Captioning | Unverified | 0
Improving Image Captioning with Control Signal of Sentence Quality | Jun 7, 2022 | Image Captioning, Sentence | Unverified | 0
Examining the Effects of Language-and-Vision Data Augmentation for Generation of Descriptions of Human Faces | Jun 1, 2022 | Caption Generation, Data Augmentation | Unverified | 0
Visual Transformer for Object Detection | Jun 1, 2022 | Image Captioning, Machine Translation | Unverified | 0
Expressive Scene Graph Generation Using Commonsense Knowledge Infusion for Visual Understanding and Reasoning | May 31, 2022 | Common Sense Reasoning, Graph Generation | Code Available | 1
BAN-Cap: A Multi-Purpose English-Bangla Image Descriptions Dataset | May 28, 2022 | Image Captioning, Machine Translation | Code Available | 0
Variational Transformer: A Framework Beyond the Trade-off between Accuracy and Diversity for Image Captioning | May 28, 2022 | Diversity, Image Captioning | Code Available | 0
GIT: A Generative Image-to-text Transformer for Vision and Language | May 27, 2022 | Decoder, Image Captioning | Code Available | 2
Prompt-based Learning for Unpaired Image Captioning | May 26, 2022 | Image Captioning, Image-text Retrieval | Unverified | 0
Fine-grained Image Captioning with CLIP Reward | May 26, 2022 | Caption Generation, Descriptive | Code Available | 2
Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset | May 25, 2022 | Image Captioning, Image Retrieval | Unverified | 0
Mutual Information Divergence: A Unified Metric for Multimodal Generative Models | May 25, 2022 | Hallucination Pair-wise Detection (1-ref), Hallucination Pair-wise Detection (4-ref) | Code Available | 1
Reassessing Evaluation Practices in Visual Question Answering: A Case Study on Out-of-Distribution Generalization | May 24, 2022 | Image Captioning, Out-of-Distribution Generalization | Unverified | 0