LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding | Dec 29, 2020 | Document Image Classification, Document Layout Analysis | Code Available
Object-Centric Diagnosis of Visual Reasoning | Dec 21, 2020 | Diagnostic, Object | Unverified
Learning content and context with language bias for Visual Question Answering | Dec 21, 2020 | Question Answering, Visual Question Answering | Code Available
KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA | Dec 20, 2020 | Visual Question Answering (VQA) | Unverified
On Modality Bias in the TVQA Dataset | Dec 18, 2020 | Question Answering, Video Question Answering | Code Available
Trying Bilinear Pooling in Video-QA | Dec 18, 2020 | Question Answering, Video Question Answering | Unverified
KVL-BERT: Knowledge Enhanced Visual-and-Linguistic BERT for Visual Commonsense Reasoning | Dec 13, 2020 | Sentence, Visual Commonsense Reasoning | Unverified
Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps | Dec 9, 2020 | Decoder, Image Captioning | Unverified
Study on the Assessment of the Quality of Experience of Streaming Video | Dec 8, 2020 | regression, Video Quality Assessment | Code Available
Understanding Guided Image Captioning Performance across Domains | Dec 4, 2020 | Descriptive, Image Captioning | Code Available
WeaQA: Weak Supervision via Captions for Visual Question Answering | Dec 4, 2020 | Question Answering, Visual Question Answering | Unverified
Multimodal Graph Networks for Compositional Generalization in Visual Question Answering | Dec 1, 2020 | Graph Neural Network, Question Answering | Unverified
Open-Ended Multi-Modal Relational Reasoning for Video Question Answering | Dec 1, 2020 | Question Answering, Relational Reasoning | Code Available
A Unified Framework for Multilingual and Code-Mixed Visual Question Answering | Dec 1, 2020 | Question Answering, Visual Question Answering | Unverified
Towards Knowledge-Augmented Visual Question Answering | Dec 1, 2020 | General Knowledge, Graph Attention | Code Available
Learning from Lexical Perturbations for Consistent Visual Question Answering | Nov 26, 2020 | Question Answering, Visual Question Answering | Code Available
Siamese Tracking with Lingual Object Constraints | Nov 23, 2020 | Object, Object Tracking | Code Available
Interpretable Visual Reasoning via Induced Symbolic Space | Nov 23, 2020 | Visual Question Answering (VQA), Visual Reasoning | Code Available
Modular Graph Attention Network for Complex Visual Relational Reasoning | Nov 22, 2020 | Graph Attention, Question Answering | Unverified
Logically Consistent Loss for Visual Question Answering | Nov 19, 2020 | Multi-Task Learning, Question Answering | Unverified
Generating Natural Questions from Images for Multimodal Assistants | Nov 17, 2020 | Common Sense Reasoning, Natural Questions | Unverified
CapWAP: Captioning with a Purpose | Nov 9, 2020 | Image Captioning, Question Answering | Unverified
Learning to Model and Ignore Dataset Bias with Mixed Capacity Ensembles | Nov 7, 2020 | Natural Language Inference, Question Answering | Code Available
An Improved Attention for Visual Question Answering | Nov 4, 2020 | Decoder, Question Answering | Code Available
Reasoning Over History: Context Aware Visual Dialog | Nov 2, 2020 | coreference-resolution, Coreference Resolution | Unverified
Can Pre-training help VQA with Lexical Variations? | Nov 1, 2020 | Question Answering, Visual Question Answering | Unverified
Representation, Learning and Reasoning on Spatial Language for Downstream NLP Tasks | Nov 1, 2020 | Common Sense Reasoning, Question Answering | Unverified
STL-CQA: Structure-based Transformers with Localization and Encoding for Chart Question Answering | Nov 1, 2020 | Chart Question Answering, Question Answering | Unverified
CapWAP: Image Captioning with a Purpose | Nov 1, 2020 | Image Captioning, Question Answering | Unverified
ISAAQ - Mastering Textbook Questions with Pre-trained Transformers and Bottom-Up and Top-Down Attention | Nov 1, 2020 | Multiple-choice, Question Answering | Unverified
Loss re-scaling VQA: Revisiting the Language Prior Problem from a Class-imbalance View | Oct 30, 2020 | Face Recognition, image-classification | Code Available
Leveraging Visual Question Answering to Improve Text-to-Image Synthesis | Oct 28, 2020 | Auxiliary Learning, Image Generation | Unverified
Beyond VQA: Generating Multi-word Answer and Rationale to Visual Questions | Oct 24, 2020 | General Classification, Multiple-choice | Unverified
SOrT-ing VQA Models : Contrastive Gradient Learning for Improved Consistency | Oct 20, 2020 | Question Answering, Visual Grounding | Code Available
Answer-checking in Context: A Multi-modal Fully Attention Network for Visual Question Answering | Oct 17, 2020 | Question Answering, Visual Question Answering | Unverified
New Ideas and Trends in Deep Multimodal Content Understanding: A Review | Oct 16, 2020 | Cross-Modal Retrieval, Deep Learning | Unverified
Does my multimodal model learn cross-modal interactions? It's harder to tell than you might think! | Oct 13, 2020 | Diagnostic, Image-text Classification | Unverified
Interpretable Neural Computation for Real-World Compositional Visual Question Answering | Oct 10, 2020 | Question Answering, Visual Question Answering | Unverified
Characterizing Datasets for Social Visual Question Answering, and the New TinySocial Dataset | Oct 8, 2020 | Question Answering, Visual Question Answering | Unverified
Finding the Evidence: Localization-aware Answer Prediction for Text Visual Question Answering | Oct 6, 2020 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified
Pathological Visual Question Answering | Oct 6, 2020 | AI Agent, Question Answering | Unverified
Attention Guided Semantic Relationship Parsing for Visual Question Answering | Oct 5, 2020 | Object, Question Answering | Unverified
CAPTION: Correction by Analyses, POS-Tagging and Interpretation of Objects using only Nouns | Oct 2, 2020 | Image Captioning, object-detection | Unverified
ISAAQ -- Mastering Textbook Questions with Pre-trained Transformers and Bottom-Up and Top-Down Attention | Oct 1, 2020 | Multiple-choice, Question Answering | Unverified
Graph-based Heuristic Search for Module Selection Procedure in Neural Module Network | Sep 30, 2020 | Heuristic Search, Question Answering | Unverified
Spatial Attention as an Interface for Image Captioning Models | Sep 29, 2020 | Image Captioning, Question Answering | Unverified
Hierarchical Deep Multi-modal Network for Medical Visual Question Answering | Sep 27, 2020 | Descriptive, Medical Visual Question Answering | Code Available
Multiple interaction learning with question-type prior knowledge for constraining answer search space in visual question answering | Sep 23, 2020 | Question Answering, Visual Question Answering | Code Available
Regularizing Attention Networks for Anomaly Detection in Visual Question Answering | Sep 21, 2020 | Anomaly Detection, Question Answering | Unverified
A Multimodal Memes Classification: A Survey and Open Research Issues | Sep 17, 2020 | Classification, General Classification | Unverified