Generating Triples with Adversarial Networks for Scene Graph Construction | Feb 7, 2018 | Attribute, graph construction
Generating Rationales in Visual Question Answering | Apr 4, 2020 | Question Answering, Visual Question Answering
Assessing Visual Quality of Omnidirectional Videos | Jul 14, 2019 | Visual Question Answering (VQA)
Language Models are General-Purpose Interfaces | Jun 13, 2022 | Causal Language Modeling, Few-Shot Learning
Ontology-based knowledge representation for bone disease diagnosis: a foundation for safe and sustainable medical artificial intelligence systems | Jun 5, 2025 | Diagnostic, Multimodal Deep Learning
Generating Natural Questions from Images for Multimodal Assistants | Nov 17, 2020 | Common Sense Reasoning, Natural Questions
LAPDoc: Layout-Aware Prompting for Documents | Feb 15, 2024 | document understanding, Key Information Extraction
Generating Natural Language Explanations for Visual Question Answering using Scene Graphs and Visual Attention | Feb 15, 2019 | Explanation Generation, Language Modeling
COIN: Counterfactual Image Generation for VQA Interpretation | Jan 10, 2022 | counterfactual, Image Generation
DiffVQA: Video Quality Assessment Using Diffusion Feature Extractor | May 6, 2025 | Mamba, Video Quality Assessment
Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge | May 30, 2023 | Answer Selection, Question Answering
Assessing the Robustness of Visual Question Answering Models | Nov 30, 2019 | Question Answering, Visual Question Answering
Generalized Hadamard-Product Fusion Operators for Visual Question Answering | Mar 26, 2018 | Neural Architecture Search, Question Answering
Generalization Differences between End-to-End and Neuro-Symbolic Vision-Language Reasoning Systems | Oct 26, 2022 | Question Answering, Visual Question Answering
Directional Gradient Projection for Robust Fine-Tuning of Foundation Models | Feb 21, 2025 | image-classification, Image Classification
Latent Image and Video Resolution Prediction using Convolutional Neural Networks | Oct 17, 2024 | Video Quality Assessment, Visual Question Answering (VQA)
Latent Variable Models for Visual Question Answering | Jan 16, 2021 | Benchmarking, Question Answering
Assessing Image Quality Issues for Real-World Problems | Mar 27, 2020 | Image Captioning, Question Answering
Aligned Dual Channel Graph Convolutional Network for Visual Question Answering | Jul 1, 2020 | Question Answering, Visual Question Answering
LAVIS: A Library for Language-Vision Intelligence | Sep 15, 2022 | Benchmarking, Image Captioning
Gender and Racial Bias in Visual Question Answering Datasets | May 17, 2022 | Question Answering, Visual Question Answering
Gemini Pro Defeated by GPT-4V: Evidence from Education | Dec 27, 2023 | image-classification, Image Classification
COCO is "ALL" You Need for Visual Instruction Fine-tuning | Jan 17, 2024 | All, Image Captioning
Ask Me Anything: Free-form Visual Question Answering Based on Knowledge from External Sources | Nov 22, 2015 | Form, General Knowledge
GEMeX-ThinkVG: Towards Thinking with Visual Grounding in Medical VQA via Reinforcement Learning | Jun 22, 2025 | Answer Generation, Decision Making
LEAF-QA: Locate, Encode & Attend for Figure Question Answering | Jul 30, 2019 | Chart Question Answering, Question Answering
GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis | Nov 25, 2024 | Medical Visual Question Answering, Multiple-choice
GC-KBVQA: A New Four-Stage Framework for Enhancing Knowledge Based Visual Question Answering Performance | May 25, 2025 | Caption Generation, Question Answering
Making Video Quality Assessment Models Robust to Bit Depth | Apr 25, 2023 | Video Quality Assessment, Visual Question Answering (VQA)
Learning by Asking Questions | Dec 4, 2017 | Question Answering, Visual Question Answering
Learning by Hallucinating: Vision-Language Pre-training with Weak Supervision | Oct 24, 2022 | cross-modal alignment, Cross-Modal Retrieval
Learning Compositional Representation for Few-shot Visual Question Answering | Feb 21, 2021 | Attribute, Question Answering
Asking questions on handwritten document collections | Oct 2, 2021 | Optical Character Recognition (OCR), Question Answering
CoBIT: A Contrastive Bi-directional Image-Text Generation Model | Mar 23, 2023 | Decoder, Image Generation
Binding Touch to Everything: Learning Unified Multimodal Tactile Representations | Jan 31, 2024 | Question Answering, Visual Question Answering (VQA)
FVQA: Fact-based Visual Question Answering | Jun 17, 2016 | Common Sense Reasoning, Question Answering
FVQA 2.0: Introducing Adversarial Samples into Fact-based Visual Question Answering | Mar 19, 2023 | Common Sense Reasoning, Information Retrieval
Learning Models for Actions and Person-Object Interactions with Transfer to Question Answering | Apr 16, 2016 | General Classification, Human-Object Interaction Detection
Learning Reasoning Paths over Semantic Graphs for Video-grounded Dialogues | Mar 1, 2021 | Question Answering, Visual Question Answering
Document AI: Benchmarks, Models and Applications | Nov 16, 2021 | Deep Learning, Document AI
Making the V in Text-VQA Matter | Aug 1, 2023 | Optical Character Recognition (OCR), TextVQA
Fusion of Domain-Adapted Vision and Language Models for Medical Visual Question Answering | Apr 24, 2024 | Language Modeling, Language Modelling
Learning Sparse Mixture of Experts for Visual Question Answering | Sep 19, 2019 | Mixture-of-Experts, Question Answering
Learning to Answer Multilingual and Code-Mixed Questions | Nov 14, 2022 | AI Agent, Question Answering
Fusion of Detected Objects in Text for Visual Question Answering | Aug 14, 2019 | Question Answering, Visual Commonsense Reasoning
Ada-DQA: Adaptive Diverse Quality-aware Feature Acquisition for Video Quality Assessment | Aug 1, 2023 | Diversity, Knowledge Distillation
FunBench: Benchmarking Fundus Reading Skills of MLLMs | Mar 2, 2025 | Anatomy, Benchmarking
Asking More Informative Questions for Grounded Retrieval | Nov 14, 2023 | Question Answering, Question Selection
MAGIC-VQA: Multimodal And Grounded Inference with Commonsense Knowledge for Visual Question Answering | Mar 24, 2025 | Graph Neural Network, Question Answering
Making Video Quality Assessment Models Sensitive to Frame Rate Distortions | May 21, 2022 | Video Quality Assessment, Visual Question Answering (VQA)