Enhanced Knowledge Injection for Radiology Report Generation | Nov 1, 2023 | Image Captioning, Retrieval | No code (unverified)
What a Whole Slide Image Can Tell? Subtype-guided Masked Transformer for Pathological Image Captioning | Oct 31, 2023 | Image Captioning, Sentence | No code (unverified)
Improving Medical Visual Representations via Radiology Report Generation | Oct 30, 2023 | Contrastive Learning, Decoder | No code (unverified)
Women Wearing Lipstick: Measuring the Bias Between an Object and Its Related Gender | Oct 29, 2023 | Image Captioning | Code available
Impressions: Understanding Visual Semiotics and Aesthetic Impact | Oct 27, 2023 | Image Captioning, Image Description | No code (unverified)
CropCap: Embedding Visual Cross-Partition Dependency for Image Captioning | Oct 27, 2023 | Image Captioning | No code (unverified)
Apollo: Zero-shot MultiModal Reasoning with Multiple Experts | Oct 25, 2023 | Image Captioning, Multimodal Reasoning | Code available
A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation | Oct 25, 2023 | Image Captioning, Image Generation | No code (unverified)
Semantic and Expressive Variation in Image Captions Across Languages | Oct 22, 2023 | Diversity, Graph Embedding | No code (unverified)
RSAdapter: Adapting Multimodal Models for Remote Sensing Visual Question Answering | Oct 19, 2023 | Image Captioning, Question Answering | Code available
ICU: Conquering Language Barriers in Vision-and-Language Modeling by Dividing the Tasks into Image Captioning and Language Understanding | Oct 19, 2023 | Image Captioning, Language Modeling | Code available
Lost in Translation: When GPT-4V(ision) Can't See Eye to Eye with Text. A Vision-Language-Consistency Analysis of VLLMs and Beyond | Oct 19, 2023 | Image Captioning, Language Modeling | No code (unverified)
CLAIR: Evaluating Image Captions with Large Language Models | Oct 19, 2023 | Diversity, Image Captioning | No code (unverified)
Evaluating the Fairness of Discriminative Foundation Models in Computer Vision | Oct 18, 2023 | Fairness, Image Captioning | Code available
Towards Automatic Satellite Images Captions Generation Using Large Language Models | Oct 17, 2023 | Image Captioning, Management | No code (unverified)
Bounding and Filling: A Fast and Flexible Framework for Image Captioning | Oct 15, 2023 | Image Captioning, Image Description | Code available
Ziya-Visual: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning | Oct 12, 2023 | Image Captioning, Image-text Retrieval | No code (unverified)
A Comparative Study of Pre-trained CNNs and GRU-Based Attention for Image Caption Generation | Oct 11, 2023 | Caption Generation, Decoder | No code (unverified)
Improving mitosis detection on histopathology images using large vision-language models | Oct 11, 2023 | Domain Generalization, Image Captioning | No code (unverified)
LangNav: Language as a Perceptual Representation for Navigation | Oct 11, 2023 | Image Captioning, Language Modeling | No code (unverified)
The Solution for the CVPR2023 NICE Image Captioning Challenge | Oct 10, 2023 | Contrastive Learning, Image Captioning | No code (unverified)
ViCor: Bridging Visual Understanding and Commonsense Reasoning with Large Language Models | Oct 9, 2023 | Image Captioning, Visual Commonsense Reasoning | No code (unverified)
Lightweight In-Context Tuning for Multimodal Unified Models | Oct 8, 2023 | Image Captioning, In-Context Learning | No code (unverified)
Module-wise Adaptive Distillation for Multimodality Foundation Models | Oct 6, 2023 | Image Captioning, Thompson Sampling | No code (unverified)
IcoCap: Improving Video Captioning by Compounding Images | Oct 5, 2023 | Image Captioning, Video Captioning | No code (unverified)
On the Performance of Multimodal Language Models | Oct 4, 2023 | Benchmarking, Binary Classification | No code (unverified)
Language Models as Knowledge Bases for Visual Word Sense Disambiguation | Oct 3, 2023 | Image Captioning, Multiple-choice | Code available
Self-Supervised Open-Ended Classification with Small Visual Language Models | Sep 30, 2023 | Few-Shot Learning, Image Captioning | No code (unverified)
ELIP: Efficient Language-Image Pre-training with Fewer Vision Tokens | Sep 28, 2023 | Cross-Modal Retrieval, GPU | Code available
Targeted Image Data Augmentation Increases Basic Skills Captioning Robustness | Sep 27, 2023 | Data Augmentation, Image Captioning | No code (unverified)
BLIP-Adapter: Parameter-Efficient Transfer Learning for Mobile Screenshot Captioning | Sep 26, 2023 | Image Captioning, Transfer Learning | Code available
Aligning Large Multimodal Models with Factually Augmented RLHF | Sep 25, 2023 | Hallucination, Image Captioning | No code (unverified)
FaceGemma: Enhancing Image Captioning with Facial Attributes for Portrait Images | Sep 24, 2023 | Attribute, Caption Generation | No code (unverified)
iPIC-XAI: Improving PIC-XAI for Enhanced Image Captioning Explanation | Sep 23, 2023 | Image Captioning, TAG | Code available
Contextual Emotion Estimation from Image Captions | Sep 22, 2023 | Image Captioning, Language Modelling | No code (unverified)
Implicit Differentiable Outlier Detection Enable Robust Deep Multimodal Analysis | Sep 21, 2023 | Cross-Modal Retrieval, Image Captioning | Code available
Auto-ACD: A Large-scale Dataset for Audio-Language Representation Learning | Sep 20, 2023 | Audio captioning, Caption Generation | No code (unverified)
Prefix-diffusion: A Lightweight Diffusion Model for Diverse Image Captioning | Sep 10, 2023 | Denoising, Diversity | No code (unverified)
Towards Better Multi-modal Keyphrase Generation via Visual Entity Enhancement and Multi-granularity Image Noise Filtering | Sep 9, 2023 | Image Captioning, Image-text matching | Code available
Physically Grounded Vision-Language Models for Robotic Manipulation | Sep 5, 2023 | Image Captioning, Language Modelling | No code (unverified)
NICE: CVPR 2023 Challenge on Zero-shot Image Captioning | Sep 5, 2023 | Fairness, Image Captioning | No code (unverified)
RSDiff: Remote Sensing Image Generation from Text Using Diffusion Model | Sep 3, 2023 | Decision Making, Image Captioning | No code (unverified)
Towards Addressing the Misalignment of Object Proposal Evaluation for Vision-Language Tasks via Semantic Grounding | Sep 1, 2023 | Graph Generation, Image Captioning | Code available
Finding-Aware Anatomical Tokens for Chest X-Ray Automated Reporting | Aug 30, 2023 | Image Captioning, Language Modelling | No code (unverified)
Can Prompt Learning Benefit Radiology Report Generation? | Aug 30, 2023 | Image Captioning, Prompt Engineering | No code (unverified)
Towards Real Time Egocentric Segment Captioning for The Blind and Visually Impaired in RGB-D Theatre Images | Aug 26, 2023 | Autonomous Driving, Image Captioning | No code (unverified)
DLIP: Distilling Language-Image Pre-training | Aug 24, 2023 | Image Captioning, Image-text Retrieval | No code (unverified)
Explore and Tell: Embodied Visual Captioning in 3D Environments | Aug 21, 2023 | Image Captioning, Navigate | No code (unverified)
Generic Attention-model Explainability by Weighted Relevance Accumulation | Aug 20, 2023 | Image Captioning, Question Answering | No code (unverified)
Visually-Aware Context Modeling for News Image Captioning | Aug 16, 2023 | Articles, Image Captioning | Code available