
Breaking Down Questions for Outside-Knowledge VQA

2021-09-29

Jialin Wu, Ray Mooney


Abstract

While general Visual Question Answering (VQA) focuses on querying visual content within an image, there is a recent trend towards Knowledge-Based VQA (KB-VQA) where a system needs to link some aspects of the question to different types of knowledge beyond the image, such as commonsense concepts and factual information. To address this issue, we propose a novel approach that passes knowledge from various sources between different pieces of semantic content in the question. Questions are first segmented into several chunks, and each segment is used as a key to retrieve knowledge from ConceptNet and Wikipedia. Then, a graph neural network, taking advantage of the question's syntactic structure, integrates the knowledge for different segments to jointly predict the answer. Our experiments on the OK-VQA dataset show that our approach achieves new state-of-the-art results.
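The pipeline the abstract describes (segment the question, retrieve knowledge per segment, integrate across segments to predict an answer) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the chunker is a naive stopword filter rather than a real segmenter, the toy dictionary stands in for ConceptNet/Wikipedia retrieval, and majority voting over fact objects stands in for the paper's graph neural network over the question's syntactic structure.

```python
from collections import Counter

def segment_question(question):
    """Naive chunker: keep content words (a stand-in for real question segmentation)."""
    stop = {"what", "is", "the", "a", "an", "of", "in", "on", "this"}
    words = [w.strip("?.,!").lower() for w in question.split()]
    return [w for w in words if w and w not in stop]

# Toy knowledge base standing in for retrieval from ConceptNet and Wikipedia.
TOY_KB = {
    "banana": ["banana HasProperty yellow", "banana HasColor yellow", "banana IsA fruit"],
    "yellow": ["yellow IsA color"],
}

def retrieve(segment):
    """Look up facts keyed by a question segment."""
    return TOY_KB.get(segment, [])

def predict_answer(question):
    """Integrate facts retrieved for each segment by majority vote over
    fact objects -- a crude stand-in for the paper's GNN-based integration."""
    facts = [f for seg in segment_question(question) for f in retrieve(seg)]
    objects = [f.split()[-1] for f in facts]
    return Counter(objects).most_common(1)[0][0] if objects else None
```

In the real system each segment would be embedded together with its retrieved knowledge, and a graph neural network over the parse tree would pass messages between segments before answer prediction; the voting step above is only a placeholder for that integration.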
