BLIP-Adapter: Parameter-Efficient Transfer Learning for Mobile Screenshot Captioning | Sep 26, 2023 | Image Captioning, Transfer Learning | Code Available | 0
Aligning Large Multimodal Models with Factually Augmented RLHF | Sep 25, 2023 | Hallucination, Image Captioning | Unverified | 0
FaceGemma: Enhancing Image Captioning with Facial Attributes for Portrait Images | Sep 24, 2023 | Attribute, Caption Generation | Unverified | 0
iPIC-XAI: Improving PIC-XAI for Enhanced Image Captioning Explanation | Sep 23, 2023 | Image Captioning, TAG | Code Available | 0
Contextual Emotion Estimation from Image Captions | Sep 22, 2023 | Image Captioning, Language Modelling | Unverified | 0
Implicit Differentiable Outlier Detection Enable Robust Deep Multimodal Analysis | Sep 21, 2023 | Cross-Modal Retrieval, Image Captioning | Code Available | 0
Auto-ACD: A Large-scale Dataset for Audio-Language Representation Learning | Sep 20, 2023 | Audio captioning, Caption Generation | Unverified | 0
Beyond Generation: Harnessing Text to Image Models for Object Detection and Segmentation | Sep 12, 2023 | Image Captioning, Image Generation | Code Available | 1
Prefix-diffusion: A Lightweight Diffusion Model for Diverse Image Captioning | Sep 10, 2023 | Denoising, Diversity | Unverified | 0
Towards Better Multi-modal Keyphrase Generation via Visual Entity Enhancement and Multi-granularity Image Noise Filtering | Sep 9, 2023 | Image Captioning, Image-text matching | Code Available | 0
Physically Grounded Vision-Language Models for Robotic Manipulation | Sep 5, 2023 | Image Captioning, Language Modelling | Unverified | 0
NICE: CVPR 2023 Challenge on Zero-shot Image Captioning | Sep 5, 2023 | Fairness, Image Captioning | Unverified | 0
Exchanging-based Multimodal Fusion with Transformer | Sep 5, 2023 | Image Captioning, Image Generation | Code Available | 1
RSDiff: Remote Sensing Image Generation from Text Using Diffusion Model | Sep 3, 2023 | Decision Making, Image Captioning | Unverified | 0
Towards Addressing the Misalignment of Object Proposal Evaluation for Vision-Language Tasks via Semantic Grounding | Sep 1, 2023 | Graph Generation, Image Captioning | Code Available | 0
Finding-Aware Anatomical Tokens for Chest X-Ray Automated Reporting | Aug 30, 2023 | Image Captioning, Language Modelling | Unverified | 0
Can Prompt Learning Benefit Radiology Report Generation? | Aug 30, 2023 | Image Captioning, Prompt Engineering | Unverified | 0
CLIPTrans: Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation | Aug 29, 2023 | Image Captioning, Machine Translation | Code Available | 1
Towards Real Time Egocentric Segment Captioning for The Blind and Visually Impaired in RGB-D Theatre Images | Aug 26, 2023 | Autonomous Driving, Image Captioning | Unverified | 0
MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning | Aug 25, 2023 | Image Captioning, Video Captioning | Code Available | 1
VIGC: Visual Instruction Generation and Correction | Aug 24, 2023 | Hallucination, Image Captioning | Code Available | 1
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond | Aug 24, 2023 | Chart Question Answering, FS-MEVQA | Code Available | 5
DLIP: Distilling Language-Image Pre-training | Aug 24, 2023 | Image Captioning, Image-text Retrieval | Unverified | 0
CgT-GAN: CLIP-guided Text GAN for Image Captioning | Aug 23, 2023 | Image Captioning | Code Available | 1
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning | Aug 23, 2023 | Decoder, Image Captioning | Code Available | 1
Explore and Tell: Embodied Visual Captioning in 3D Environments | Aug 21, 2023 | Image Captioning, Navigate | Unverified | 0
Generic Attention-model Explainability by Weighted Relevance Accumulation | Aug 20, 2023 | Image Captioning, Question Answering | Unverified | 0
VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control | Aug 18, 2023 | Image Captioning, Text Generation | Code Available | 1
Pro-Cap: Leveraging a Frozen Vision-Language Model for Hateful Meme Detection | Aug 16, 2023 | Image Captioning, Language Modeling | Code Available | 1
Visually-Aware Context Modeling for News Image Captioning | Aug 16, 2023 | Articles, Image Captioning | Code Available | 0
GIT-Mol: A Multi-modal Large Language Model for Molecular Science with Graph, Image, and Text | Aug 14, 2023 | Drug Discovery, Image Captioning | Code Available | 1
UniBrain: Unify Image Reconstruction and Captioning All in One Diffusion Model from Human Brain Activity | Aug 14, 2023 | All, Brain Decoding | Unverified | 0
Diffusion Based Augmentation for Captioning and Retrieval in Cultural Heritage | Aug 14, 2023 | Image Captioning, Retrieval | Code Available | 0
Informative Scene Graph Generation via Debiasing | Aug 10, 2023 | Blocking, Graph Generation | Unverified | 0
IIHT: Medical Report Generation with Image-to-Indicator Hierarchical Transformer | Aug 10, 2023 | Image Captioning, Machine Translation | Unverified | 0
Asynchronous Evolution of Deep Neural Network Architectures | Aug 8, 2023 | Evolutionary Algorithms, Image Captioning | Unverified | 0
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions | Aug 8, 2023 | Caption Generation, Image Captioning | Code Available | 2
Building Safe and Reliable AI systems for Safety Critical Tasks with Vision-Language Processing | Aug 6, 2023 | Image Captioning, Out of Distribution (OOD) Detection | Unverified | 0
A Comprehensive Analysis of Real-World Image Captioning and Scene Identification | Aug 5, 2023 | Descriptive, Image Captioning | Unverified | 0
Improving Generalization of Image Captioning with Unsupervised Prompt Learning | Aug 5, 2023 | Attribute, Image Captioning | Unverified | 0
Multimodal Neurons in Pretrained Text-Only Transformers | Aug 3, 2023 | Image Captioning, Image to text | Unverified | 0
Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model | Aug 2, 2023 | Hallucination, Image Captioning | Code Available | 1
TS-RGBD Dataset: a Novel Dataset for Theatre Scenes Description for People with Visual Impairments | Aug 2, 2023 | Action Recognition, Image Captioning | Code Available | 0
ADS-Cap: A Framework for Accurate and Diverse Stylized Captioning with Unpaired Stylistic Corpora | Aug 2, 2023 | Contrastive Learning, Diversity | Code Available | 0
Guiding Image Captioning Models Toward More Specific Captions | Jul 31, 2023 | Image Captioning, Image Retrieval | Unverified | 0
Transferable Decoding with Visual Entities for Zero-Shot Image Captioning | Jul 31, 2023 | Caption Generation, Hallucination | Code Available | 1
Visual Captioning at Will: Describing Images and Videos Guided by a Few Stylized Sentences | Jul 31, 2023 | Decoder, Image Captioning | Unverified | 0
RSGPT: A Remote Sensing Vision Language Model and Benchmark | Jul 28, 2023 | Image Captioning, Language Modeling | Code Available | 1
Exploring Annotation-free Image Captioning with Retrieval-augmented Pseudo Sentence Generation | Jul 27, 2023 | Image Captioning, Model Optimization | Code Available | 0
Causal reasoning in typical computer vision tasks | Jul 26, 2023 | Autonomous Driving, Deep Learning | Unverified | 0