Breaking Down Questions for Outside-Knowledge Visual Question Answering

2021-11-16 · ACL ARR November 2021

Anonymous

Abstract

There is a recent trend towards Knowledge-Based VQA (KB-VQA), where different aspects of a question require different sources of knowledge, including the image's visual content and external knowledge such as commonsense concepts and factual information. To address this challenge, we propose a novel approach that passes knowledge from various sources between the different pieces of semantic content in a question. Questions are first segmented into chunks, and each chunk is used to generate queries that retrieve knowledge from ConceptNet and Wikipedia. A graph neural network then exploits the question's syntactic structure to integrate the knowledge across segments and jointly predict the answer. Our experiments on the OK-VQA dataset show that our approach achieves new state-of-the-art results.
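As a rough illustration (not the authors' code), the pipeline the abstract describes — segment the question, retrieve knowledge per segment, then aggregate over the question's syntactic structure — might be sketched as below. The chunker, knowledge base, and message-passing step are all toy stand-ins: real systems would use a syntactic parser, ConceptNet/Wikipedia queries, and a learned graph neural network.

```python
import re

def segment_question(question):
    # Toy chunker: split on commas and "and"; a real system would use
    # a syntactic parser to produce question segments.
    return [s.strip() for s in re.split(r",| and ", question) if s.strip()]

def retrieve(segment, kb):
    # Toy retrieval: return every KB fact whose key term appears in the
    # segment (stand-in for querying ConceptNet / Wikipedia).
    return {fact for key, fact in kb.items() if key in segment.lower()}

def propagate(node_facts, edges, steps=2):
    # GNN-style message passing over syntactic edges between segments:
    # each node's fact set absorbs its neighbours' facts at every step
    # (a learned model would aggregate vector features instead of sets).
    for _ in range(steps):
        updated = [set(facts) for facts in node_facts]
        for a, b in edges:
            updated[a] |= node_facts[b]
            updated[b] |= node_facts[a]
        node_facts = updated
    return node_facts

# Tiny demonstration with a hypothetical two-fact knowledge base.
kb = {"fruit": "a banana is a fruit", "yellow": "bananas are yellow"}
segments = segment_question("What is the yellow thing, and is it a fruit")
facts = [retrieve(s, kb) for s in segments]
merged = propagate(facts, edges=[(0, 1)])  # one syntactic edge linking the chunks
```

After propagation, both segments hold the union of the retrieved facts, mirroring how the paper lets knowledge flow between pieces of semantic content before answer prediction.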
