GPTs Are Multilingual Annotators for Sequence Generation Tasks | Feb 8, 2024 | Image Captioning | Code Available
Exploring Visual Culture Awareness in GPT-4V: A Comprehensive Probing | Feb 8, 2024 | Image Captioning TAG | Unverified
CIC: A Framework for Culturally-Aware Image Captioning | Feb 8, 2024 | Descriptive Image Captioning | Unverified
Examining Gender and Racial Bias in Large Vision-Language Models Using a Novel Dataset of Parallel Images | Feb 8, 2024 | Image Captioning Question Answering | Code Available
Image captioning for Brazilian Portuguese using GRIT model | Feb 7, 2024 | Image Captioning model | Unverified
Text or Image? What is More Important in Cross-Domain Generalization Capabilities of Hate Meme Detection Models? | Feb 7, 2024 | Domain Generalization Image Captioning | Unverified
PICS: Pipeline for Image Captioning and Search | Feb 1, 2024 | Asset Management Image Captioning | Unverified
SCO-VIST: Social Interaction Commonsense Knowledge-based Visual Storytelling | Feb 1, 2024 | Diversity Image Captioning | Unverified
Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data | Jan 31, 2024 | Benchmarking Change Detection | Code Available
COCO is "ALL" You Need for Visual Instruction Fine-tuning | Jan 17, 2024 | All Image Captioning | Unverified
KTVIC: A Vietnamese Image Captioning Dataset on the Life Domain | Jan 16, 2024 | Image Captioning Vietnamese Image Captioning | Unverified
Jewelry Recognition via Encoder-Decoder Models | Jan 15, 2024 | Decoder Image Captioning | Unverified
What Else Would I Like? A User Simulator using Alternatives for Improved Evaluation of Fashion Conversational Recommendation Systems | Jan 11, 2024 | Conversational Recommendation Image Captioning | Unverified
Let's Go Shopping (LGS) -- Web-Scale Image-Text Dataset for Visual Concept Understanding | Jan 9, 2024 | Image Captioning image-classification | Unverified
MAMI: Multi-Attentional Mutual-Information for Long Sequence Neuron Captioning | Jan 5, 2024 | Decoder Image Captioning | Unverified
Hyperparameter-Free Approach for Faster Minimum Bayes Risk Decoding | Jan 5, 2024 | Image Captioning Machine Translation | Code Available
Object-oriented backdoor attack against image captioning | Jan 5, 2024 | Backdoor Attack Image Captioning | Unverified
SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment | Jan 4, 2024 | Image Captioning image-classification | Unverified
Social Media Ready Caption Generation for Brands | Jan 3, 2024 | Caption Generation Image Captioning | Unverified
Cycle-Consistency Learning for Captioning and Grounding | Dec 23, 2023 | Image Captioning Visual Grounding | Unverified
LLM4VG: Large Language Models Evaluation for Video Grounding | Dec 21, 2023 | Image Captioning Video Grounding | Unverified
p-Laplacian Adaptation for Generative Pre-trained Vision-Language Models | Dec 17, 2023 | Image Captioning Question Answering | Code Available
Dietary Assessment with Multimodal ChatGPT: A Systematic Analysis | Dec 14, 2023 | Image Captioning Scene Understanding | Unverified
Improving Cross-modal Alignment with Synthetic Pairs for Text-only Image Captioning | Dec 14, 2023 | cross-modal alignment Decoder | Unverified
Synocene, Beyond the Anthropocene: De-Anthropocentralising Human-Nature-AI Interaction | Dec 13, 2023 | Chatbot Image Captioning | Unverified
Filter & Align: Leveraging Human Knowledge to Curate Image-Text Data | Dec 11, 2023 | Image Captioning Image-text Retrieval | Unverified
Unifying Text, Tables, and Images for Multimodal Question Answering | Dec 10, 2023 | Image Captioning Question Answering | Code Available
PixLore: A Dataset-driven Approach to Rich Image Captioning | Dec 8, 2023 | GPU Image Captioning | Code Available
Lyrics: Boosting Fine-grained Language-Vision Alignment and Comprehension via Semantic-aware Visual Objects | Dec 8, 2023 | Image Captioning object-detection | Unverified
User-Aware Prefix-Tuning is a Good Learner for Personalized Image Captioning | Dec 8, 2023 | Image Captioning Language Modeling | Unverified
On the Robustness of Large Multimodal Models Against Image Adversarial Attacks | Dec 6, 2023 | Image Captioning image-classification | Unverified
Towards More Unified In-context Visual Understanding | Dec 5, 2023 | Decoder Image Captioning | Unverified
CLAMP: Contrastive LAnguage Model Prompt-tuning | Dec 4, 2023 | Contrastive Learning Image Captioning | Unverified
Automatic Report Generation for Histopathology images using pre-trained Vision Transformers and BERT | Dec 3, 2023 | Caption Generation Decoder | Code Available
Video Summarization: Towards Entity-Aware Captions | Dec 1, 2023 | Image Captioning Video Captioning | Code Available
Enhancing Image Captioning with Neural Models | Dec 1, 2023 | Caption Generation Image Captioning | Unverified
Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts | Dec 1, 2023 | Chart Question Answering Document AI | Unverified
InstructSeq: Unifying Vision Tasks with Instruction-conditioned Multi-modal Sequence Generation | Nov 30, 2023 | Image Captioning Referring Expression | Code Available
A natural language processing-based approach: mapping human perception by understanding deep semantic features in street view images | Nov 29, 2023 | Image Captioning Language Modelling | Unverified
MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training | Nov 28, 2023 | Image Captioning Transfer Learning | Unverified
EVCap: Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension | Nov 27, 2023 | Image Captioning Object | Unverified
DECap: Towards Generalized Explicit Caption Editing via Diffusion Mechanism | Nov 25, 2023 | Caption Generation Denoising | Unverified
Violet: A Vision-Language Model for Arabic Image Captioning with Gemini Decoder | Nov 15, 2023 | Decoder Image Captioning | Unverified
Improving Image Captioning via Predicting Structured Concepts | Nov 14, 2023 | Image Captioning | Unverified
Holistic Evaluation of GPT-4V for Biomedical Imaging | Nov 10, 2023 | Anatomy Diagnostic | Unverified
How to Bridge the Gap between Modalities: Survey on Multimodal Large Language Model | Nov 10, 2023 | Image Captioning Language Modeling | Unverified
Zero-shot Translation of Attention Patterns in VQA Models to Natural Language | Nov 8, 2023 | Image Captioning Language Modeling | Code Available
DeepPatent2: A Large-Scale Benchmarking Corpus for Technical Drawing Understanding | Nov 7, 2023 | 3D Reconstruction Benchmarking | Code Available
JaSPICE: Automatic Evaluation Metric Using Predicate-Argument Structures for Image Captioning Models | Nov 7, 2023 | Image Captioning | Code Available
Visual Analytics for Efficient Image Exploration and User-Guided Image Captioning | Nov 2, 2023 | Caption Generation Efficient Exploration | Unverified