Good, Better, Best: Textual Distractors Generation for Multiple-Choice Visual Question Answering via Reinforcement Learning Oct 21, 2019 Data Augmentation Decision Making
ISAAQ - Mastering Textbook Questions with Pre-trained Transformers and Bottom-Up and Top-Down Attention Nov 1, 2020 Multiple-choice Question Answering
Astrea: A MOE-based Visual Understanding Model with Progressive Alignment Mar 12, 2025 Contrastive Learning Cross-Modal Retrieval
Aligned Vector Quantization for Edge-Cloud Collabrative Vision-Language Models Nov 8, 2024 Quantization Question Answering
Deep Equilibrium Multimodal Fusion Jun 29, 2023 Visual Question Answering (VQA)
Iterated learning for emergent systematicity in VQA May 3, 2021 Question Answering Systematic Generalization
It Takes Two to Tango: Towards Theory of AI's Mind Apr 3, 2017 Attribute Question Answering
iVQA: Inverse Visual Question Answering Oct 10, 2017 Question Answering Question Generation
Jaeger: A Concatenation-Based Multi-Transformer VQA Model Oct 11, 2023 Dimensionality Reduction model
Goal-Oriented Semantic Communication for Wireless Visual Question Answering Nov 3, 2024 Edge-computing Question Answering
Commonsense Video Question Answering through Video-Grounded Entailment Tree Reasoning Jan 9, 2025 Benchmarking Question Answering
Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design May 22, 2023 image-classification Image Classification
ComicsPAP: understanding comic strips by picking the correct panel Mar 11, 2025 Image Captioning Visual Question Answering (VQA)
Jointly Learning Truth-Conditional Denotations and Groundings using Parallel Attention Apr 14, 2021 Question Answering Visual Question Answering
GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing Mar 16, 2025 Change Detection Image Captioning
Geometry-Aware Video Quality Assessment for Dynamic Digital Human Oct 24, 2023 Attribute Video Quality Assessment
JTD-UAV: MLLM-Enhanced Joint Tracking and Description Framework for Anti-UAV Systems Jan 1, 2025 Question Answering Visual Question Answering
Evaluating and Improving Interactions with Hazy Oracles Oct 19, 2021 Object Tracking Referring Expression
Assisting Scene Graph Generation with Self-Supervision Aug 8, 2020 Graph Generation Image Captioning
Localize, Group, and Select: Boosting Text-VQA by Scene Text Modeling Aug 20, 2021 Data Ablation Optical Character Recognition
Generative Visual Question Answering Jul 18, 2023 Generative Visual Question Answering Question Answering
KAT: A Knowledge Augmented Transformer for Vision-and-Language Jan 16, 2022 Answer Generation Decoder
Kernel Pooling for Convolutional Neural Networks Jul 1, 2017 Face Recognition Fine-Grained Visual Categorization
DePlot: One-shot visual language reasoning by plot-to-table translation Dec 20, 2022 Chart Question Answering Factual Inconsistency Detection in Chart Captioning
Combining Knowledge Graph and LLMs for Enhanced Zero-shot Visual Question Answering Jan 22, 2025 Knowledge Graphs Question Answering
Knowing Where to Look? Analysis on Attention of Visual Question Answering System Oct 9, 2018 Question Answering Visual Question Answering
Assessment of Subjective and Objective Quality of Live Streaming Sports Videos Jun 15, 2021 Video Quality Assessment Visual Question Answering (VQA)
Knowledge Acquisition for Visual Question Answering via Iterative Querying Jul 1, 2017 Question Answering Visual Question Answering
Knowledge-Based Counterfactual Queries for Visual Question Answering Mar 5, 2023 counterfactual Decision Making
Generating Triples with Adversarial Networks for Scene Graph Construction Feb 7, 2018 Attribute graph construction
Generating Rationales in Visual Question Answering Apr 4, 2020 Question Answering Visual Question Answering
Knowledge Condensation and Reasoning for Knowledge-based VQA Mar 15, 2024 Question Answering Reading Comprehension
Assessing Visual Quality of Omnidirectional Videos Jul 14, 2019 Visual Question Answering (VQA)
Knowledge Detection by Relevant Question and Image Attributes in Visual Question Answering Jun 8, 2023 Question Answering Retrieval
Ontology-based knowledge representation for bone disease diagnosis: a foundation for safe and sustainable medical artificial intelligence systems Jun 5, 2025 Diagnostic Multimodal Deep Learning
Generating Natural Questions from Images for Multimodal Assistants Nov 17, 2020 Common Sense Reasoning Natural Questions
Generating Natural Language Explanations for Visual Question Answering using Scene Graphs and Visual Attention Feb 15, 2019 Explanation Generation Language Modeling
KNVQA: A Benchmark for evaluation knowledge-based VQA Nov 21, 2023 Hallucination Object Hallucination
COIN: Counterfactual Image Generation for VQA Interpretation Jan 10, 2022 counterfactual Image Generation
Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge May 30, 2023 Answer Selection Question Answering
Assessing the Robustness of Visual Question Answering Models Nov 30, 2019 Question Answering Visual Question Answering
Generalized Hadamard-Product Fusion Operators for Visual Question Answering Mar 26, 2018 Neural Architecture Search Question Answering
Generalization Differences between End-to-End and Neuro-Symbolic Vision-Language Reasoning Systems Oct 26, 2022 Question Answering Visual Question Answering
Assessing Image Quality Issues for Real-World Problems Mar 27, 2020 Image Captioning Question Answering
Aligned Dual Channel Graph Convolutional Network for Visual Question Answering Jul 1, 2020 Question Answering Visual Question Answering
LLaVA-Ultra: Large Chinese Language and Vision Assistant for Ultrasound Oct 19, 2024 Instruction Following Knowledge Distillation
LLM4VG: Large Language Models Evaluation for Video Grounding Dec 21, 2023 Image Captioning Video Grounding
Localizing Before Answering: A Hallucination Evaluation Benchmark for Grounded Medical Multimodal LLMs Apr 30, 2025 Hallucination Hallucination Evaluation
Language bias in Visual Question Answering: A Survey and Taxonomy Nov 16, 2021 Question Answering Visual Question Answering
Gender and Racial Bias in Visual Question Answering Datasets May 17, 2022 Question Answering Visual Question Answering