Vision Language Models Can Parse Floor Plan Maps | Sep 19, 2024 | Image Captioning, Question Answering | Unverified | 0
JourneyBench: A Challenging One-Stop Vision-Language Understanding Benchmark of Generated Images | Sep 19, 2024 | Hallucination, Image Captioning | Code Available | 0
LLMs Can Check Their Own Results to Mitigate Hallucinations in Traffic Understanding Tasks | Sep 19, 2024 | Autonomous Driving, Hallucination | Unverified | 0
Decoding Style: Efficient Fine-Tuning of LLMs for Image-Guided Outfit Recommendation with Preference | Sep 18, 2024 | Image Captioning, Large Language Model | Unverified | 0
KALE: An Artwork Image Captioning System Augmented with Heterogeneous Graph | Sep 17, 2024 | cross-modal alignment, Image Captioning | Code Available | 0
Playground v3: Improving Text-to-Image Alignment with Deep-Fusion Large Language Models | Sep 16, 2024 | Decoder, Diversity | Code Available | 3
NEVLP: Noise-Robust Framework for Efficient Vision-Language Pre-training | Sep 15, 2024 | Contrastive Learning, cross-modal alignment | Unverified | 0
Evaluating authenticity and quality of image captions via sentiment and semantic analyses | Sep 14, 2024 | Image Captioning, Image to text | Unverified | 0
Bridging Paintings and Music -- Exploring Emotion based Music Generation through Paintings | Sep 12, 2024 | FAD, Image Captioning | Unverified | 0
BLens: Contrastive Captioning of Binary Functions using Ensemble Embedding | Sep 12, 2024 | Contrastive Learning, Image Captioning | Unverified | 0
Securing Vision-Language Models with a Robust Encoder Against Jailbreak and Adversarial Attacks | Sep 11, 2024 | Image Captioning, Question Answering | Code Available | 0
PoseEmbroider: Towards a 3D, Visual, Semantic-aware Human Pose Representation | Sep 10, 2024 | Image Captioning, Image Generation | Unverified | 0
LIME: Less Is More for MLLM Evaluation | Sep 10, 2024 | Image Captioning, Question Answering | Code Available | 1
Mitigating Hallucination in Visual-Language Models via Re-Balancing Contrastive Decoding | Sep 10, 2024 | Hallucination, Image Captioning | Unverified | 0
MLLM-LLaVA-FL: Multimodal Large Language Model Assisted Federated Learning | Sep 9, 2024 | Federated Learning, Image Captioning | Unverified | 0
Training-free Zero-shot Composed Image Retrieval via Weighted Modality Fusion and Similarity | Sep 7, 2024 | Image Captioning, Image Retrieval | Code Available | 0
FODA-PG for Enhanced Medical Imaging Narrative Generation: Adaptive Differentiation of Normal and Abnormal Attributes | Sep 6, 2024 | Domain Adaptation, Image Captioning | Unverified | 0
No Detail Left Behind: Revisiting Self-Retrieval for Fine-Grained Image Captioning | Sep 4, 2024 | Image Captioning, Retrieval | Unverified | 0
Kvasir-VQA: A Text-Image Pair GI Tract Dataset | Sep 2, 2024 | Image Captioning, Image Generation | Code Available | 0
MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models | Aug 30, 2024 | Image Captioning, Language Modeling | Code Available | 1
See or Guess: Counterfactually Regularized Image Captioning | Aug 29, 2024 | Causal Inference, counterfactual | Code Available | 1
Fluent and Accurate Image Captioning with a Self-Trained Reward Model | Aug 29, 2024 | Image Captioning, Specificity | Unverified | 0
Hand1000: Generating Realistic Hands from Text with Only 1,000 Images | Aug 28, 2024 | Anatomy, Gesture Recognition | Unverified | 0
Pixels to Prose: Understanding the art of Image Captioning | Aug 28, 2024 | Descriptive, Image Captioning | Unverified | 0
Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization | Aug 26, 2024 | Descriptive, Image Captioning | Code Available | 1
Bidirectional Awareness Induction in Autoregressive Seq2Seq Models | Aug 25, 2024 | Image Captioning, Machine Translation | Unverified | 0
Shifted Window Fourier Transform And Retention For Image Captioning | Aug 25, 2024 | Autonomous Vehicles, Image Captioning | Unverified | 0
The Brittleness of AI-Generated Image Watermarking Techniques: Examining Their Robustness Against Visual Paraphrasing Attacks | Aug 19, 2024 | Denoising, Image Captioning | Unverified | 0
PathInsight: Instruction Tuning of Multimodal Datasets and Models for Intelligence Assisted Diagnosis in Histopathology | Aug 13, 2024 | Image Captioning | Unverified | 0
Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy and Novel Ensemble Method | Aug 9, 2024 | Diversity, Image Captioning | Unverified | 0
FUSE-ing Language Models: Zero-Shot Adapter Discovery for Prompt Optimization Across Tokenizers | Aug 9, 2024 | Image Captioning, Transfer Learning | Code Available | 0
Enhancing Journalism with AI: A Study of Contextualized Image Captioning for News Articles using LLMs and LMMs | Aug 8, 2024 | Articles, Image Captioning | Unverified | 0
One Framework to Rule Them All: Unifying Multimodal Tasks with LLM Neural-Tuning | Aug 6, 2024 | All, Image Captioning | Unverified | 0
Modelling Visual Semantics via Image Captioning to extract Enhanced Multi-Level Cross-Modal Semantic Incongruity Representation with Attention for Multimodal Sarcasm Detection | Aug 5, 2024 | Descriptive, Image Captioning | Unverified | 0
Dataset Scale and Societal Consistency Mediate Facial Impression Bias in Vision-Language AI | Aug 4, 2024 | Image Captioning | Unverified | 0
A Novel Evaluation Framework for Image2Text Generation | Aug 3, 2024 | Image Captioning, Image Generation | Unverified | 0
The Phantom Menace: Unmasking Privacy Leakages in Vision-Language Models | Aug 2, 2024 | Image Captioning | Unverified | 0
AI Safety in Practice: Enhancing Adversarial Robustness in Multimodal Image Captioning | Jul 30, 2024 | Adversarial Robustness, Computational Efficiency | Unverified | 0
BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues | Jul 29, 2024 | Image Captioning | Code Available | 1
VolDoGer: LLM-assisted Datasets for Domain Generalization in Vision-Language Tasks | Jul 29, 2024 | Deep Learning, Domain Generalization | Unverified | 0
HICEScore: A Hierarchical Metric for Image Captioning Evaluation | Jul 26, 2024 | Descriptive, Image Captioning | Code Available | 0
SWIFT: Semantic Watermarking for Image Forgery Thwarting | Jul 26, 2024 | Image Captioning | Code Available | 0
Imperfect Vision Encoders: Efficient and Robust Tuning for Vision-Language Models | Jul 23, 2024 | Computational Efficiency, Image Captioning | Unverified | 0
DiffX: Guide Your Layout to Cross-Modal Generative Modeling | Jul 22, 2024 | Denoising, Image Captioning | Code Available | 1
VideoGameBunny: Towards vision assistants for video games | Jul 21, 2024 | Image Captioning, Scene Understanding | Unverified | 0
Downstream-Pretext Domain Knowledge Traceback for Active Learning | Jul 20, 2024 | Active Learning, Diversity | Unverified | 0
EVLM: An Efficient Vision-Language Model for Visual Understanding | Jul 19, 2024 | Image Captioning, Language Modeling | Unverified | 0
LookupViT: Compressing visual information to a limited number of tokens | Jul 17, 2024 | Image Captioning, image-classification | Unverified | 0
CIC-BART-SSA: Controllable Image Captioning with Structured Semantic Augmentation | Jul 16, 2024 | controllable image captioning, Data Augmentation | Code Available | 0
Controllable Contextualized Image Captioning: Directing the Visual Narrative through User-Defined Highlights | Jul 16, 2024 | Image Captioning, Multimodal Reasoning | Code Available | 0