On Speculative Decoding for Multimodal Large Language Models | Apr 13, 2024 | Image Captioning, Language Modeling
— Unverified 0 | FLoRA: Enhancing Vision-Language Models with Parameter-Efficient Federated Learning | Apr 12, 2024 | Federated Learning, Image Captioning
Code Available 0 | Panoptic Perception: A Novel Task and Fine-grained Dataset for Universal Remote Sensing Image Interpretation | Apr 6, 2024 | Image Captioning, Instance Segmentation
— Unverified 0 | Would Deep Generative Models Amplify Bias in Future Models? | Apr 4, 2024 | Image Captioning, Image Generation
— Unverified 0 | Jump Self-attention: Capturing High-order Statistics in Transformers | Apr 3, 2024 | Image Captioning, Natural Language Understanding
— Unverified 0 | VLRM: Vision-Language Models act as Reward Models for Image Captioning | Apr 2, 2024 | Image Captioning, reinforcement-learning
— Unverified 0 | Learning by Correction: Efficient Tuning Task for Zero-Shot Generative Vision-Language Reasoning | Apr 1, 2024 | Image Captioning, Instruction Following
Code Available 0 | LLaMA-Excitor: General Instruction Tuning via Indirect Feature Interaction | Apr 1, 2024 | Image Captioning, Instruction Following
— Unverified 0 | Text Data-Centric Image Captioning with Interactive Prompts | Mar 28, 2024 | Image Captioning
— Unverified 0 | LocCa: Visual Pretraining with Location-aware Captioners | Mar 28, 2024 | Decoder, Image Captioning
— Unverified 0 | Semantic Map-based Generation of Navigation Instructions | Mar 28, 2024 | Image Captioning
Code Available 0 | A Review of Multi-Modal Large Language and Vision Models | Mar 28, 2024 | Image Captioning, Prompt Engineering
— Unverified 0 | A Survey on Large Language Models from Concept to Implementation | Mar 27, 2024 | Chatbot, Image Captioning
— Unverified 0 | Automated Report Generation for Lung Cytological Images Using a CNN Vision Classifier and Multiple-Transformer Text Decoders: Preliminary Study | Mar 26, 2024 | Decoder, Image Captioning
— Unverified 0 | The Solution for the ICCV 2023 1st Scientific Figure Captioning Challenge | Mar 26, 2024 | Caption Generation, Image Captioning
— Unverified 0 | Visual Hallucination: Definition, Quantification, and Prescriptive Remediations | Mar 26, 2024 | Hallucination, Image Captioning
— Unverified 0 | Semi-Supervised Image Captioning Considering Wasserstein Graph Matching | Mar 26, 2024 | Data Augmentation, Graph Matching
— Unverified 0 | Image Captioning in news report scenario | Mar 24, 2024 | Image Captioning, Recommendation Systems
— Unverified 0 | Cognitive resilience: Unraveling the proficiency of image-captioning models to interpret masked visual content | Mar 23, 2024 | Descriptive, Image Captioning
Code Available 0 | A Multimodal Approach for Cross-Domain Image Retrieval | Mar 22, 2024 | Image Captioning, Image Retrieval
— Unverified 0 | MyVLM: Personalizing VLMs for User-Specific Queries | Mar 21, 2024 | Image Captioning, Language Modelling
— Unverified 0 | Inserting Faces inside Captions: Image Captioning with Attention Guided Merging | Mar 20, 2024 | Image Captioning, Retrieval
— Unverified 0 | Improved Baselines for Data-efficient Perceptual Augmentation of LLMs | Mar 20, 2024 | Audio captioning, Image Captioning
— Unverified 0 | As Firm As Their Foundations: Can open-sourced foundation models be used to create adversarial examples for downstream tasks? | Mar 19, 2024 | Adversarial Attack, Image Captioning
— Unverified 0 | Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition | Mar 19, 2024 | Dense Captioning, Image Captioning
— Unverified 0 | Towards Multimodal In-Context Learning for Vision & Language Models | Mar 19, 2024 | Image Captioning, In-Context Learning
— Unverified 0 | TARN-VIST: Topic Aware Reinforcement Network for Visual Storytelling | Mar 18, 2024 | Image Captioning, Visual Storytelling
— Unverified 0 | Few-Shot VQA with Frozen LLMs: A Tale of Two Approaches | Mar 17, 2024 | Image Captioning, Question Answering
— Unverified 0 | Does the Performance of Text-to-Image Retrieval Models Generalize Beyond Captions-as-a-Query? | Mar 15, 2024 | Descriptive, Image Captioning
Code Available 0 | Leveraging LLMs for On-the-Fly Instruction Guided Image Editing | Mar 12, 2024 | Image Captioning
Code Available 0 | Synth^2: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings | Mar 12, 2024 | Image Captioning, Image Generation
— Unverified 0 | A Comprehensive Survey of 3D Dense Captioning: Localizing and Describing Objects in 3D Scenes | Mar 12, 2024 | 3D dense captioning, Dense Captioning
— Unverified 0 | Transformer based Multitask Learning for Image Captioning and Object Detection | Mar 10, 2024 | Autonomous Navigation, Image Captioning
— Unverified 0 | The Case for Evaluating Multimodal Translation Models on Text Datasets | Mar 5, 2024 | Descriptive, Image Captioning
— Unverified 0 | What Is Missing in Multilingual Visual Reasoning and How to Fix It | Mar 3, 2024 | Image Captioning, Visual Reasoning
Code Available 0 | Improving Explicit Spatial Relationships in Text-to-Image Generation through an Automatically Derived Dataset | Mar 1, 2024 | Image Captioning, Image Generation
Code Available 0 | EAMA: Entity-Aware Multimodal Alignment Based Approach for News Image Captioning | Feb 29, 2024 | Image Captioning, Sentence
— Unverified 0 | Vision Language Model-based Caption Evaluation Method Leveraging Visual Context Extraction | Feb 28, 2024 | Image Captioning, Language Modeling
— Unverified 0 | ArcSin: Adaptive ranged cosine Similarity injected noise for Language-Driven Visual Tasks | Feb 27, 2024 | Domain Generalization, Image Captioning
— Unverified 0 | Fine-tuning CLIP Text Encoders with Two-step Paraphrasing | Feb 23, 2024 | Image Captioning, Image Retrieval
— Unverified 0 | Exploring the Frontier of Vision-Language Models: A Survey of Current Methodologies and Future Directions | Feb 20, 2024 | Image Captioning, Question Answering
— Unverified 0 | IRR: Image Review Ranking Framework for Evaluating Vision-Language Models | Feb 19, 2024 | Diversity, Image Captioning
— Unverified 0 | AICAttack: Adversarial Image Captioning Attack with Attention-Based Optimization | Feb 19, 2024 | Adversarial Attack, Image Captioning
Code Available 0 | Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language Models | Feb 19, 2024 | Image Captioning, Question Answering
— Unverified 0 | Cobra Effect in Reference-Free Image Captioning Metrics | Feb 18, 2024 | Image Captioning
Code Available 0 | Learning How To Ask: Cycle-Consistency Refines Prompts in Multimodal Foundation Models | Feb 13, 2024 | Code Generation, HumanEval
— Unverified 0 | Captions Are Worth a Thousand Words: Enhancing Product Retrieval with Pretrained Image-to-Text Models | Feb 13, 2024 | Image Captioning, Image to text
— Unverified 0 | Multimodal Learned Sparse Retrieval for Image Suggestion | Feb 12, 2024 | Image Captioning, Retrieval
— Unverified 0 | Consistency Model is an Effective Posterior Sample Approximation for Diffusion Inverse Solvers | Feb 9, 2024 | Image Captioning, Semantic Segmentation
— Unverified 0 | Large Language Models for Captioning and Retrieving Remote Sensing Images | Feb 9, 2024 | Cross-Modal Retrieval, Decoder
— Unverified 0