
An empirical analysis of existing systems and datasets toward general simple question answering

2020-12-01 · COLING 2020 · Code Available

Namgi Han, Goran Topic, Hiroshi Noji, Hiroya Takamura, Yusuke Miyao


Abstract

In this paper, we evaluate the progress of our field toward solving simple factoid questions over a knowledge base, a practically important problem in natural language interfaces to databases. As in other natural language understanding tasks, the common practice for this task is to train and evaluate a model on a single dataset, and recent studies suggest that SimpleQuestions, the largest and most popular dataset, is nearly solved under this setting. However, this common setting does not evaluate the robustness of systems outside the distribution of their training data. We rigorously evaluate the robustness of existing systems using different datasets. Our analysis, which includes shifting the training and test datasets and training on a union of the datasets, suggests that progress on the SimpleQuestions dataset does not indicate success at more general simple question answering. We discuss a possible future direction toward this goal.
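The cross-dataset protocol the abstract describes (train on one dataset, test on every other, plus a union-training setting) can be sketched as below. The dataset contents and the trivial "model" here are illustrative placeholders, not the systems or data used in the paper:

```python
from itertools import chain

# Toy stand-ins for KBQA datasets such as SimpleQuestions (contents are
# hypothetical); each example is a (question, answer) pair.
datasets = {
    "SimpleQuestions": [("who wrote x", "a"), ("who wrote y", "a")],
    "OtherDataset":    [("capital of z", "b")],
}

def train(examples):
    # Placeholder "model": memorize the most frequent answer string.
    answers = [a for _, a in examples]
    return max(set(answers), key=answers.count)

def evaluate(model, examples):
    # Accuracy of always predicting the memorized answer.
    return sum(a == model for _, a in examples) / len(examples)

# Shifting setting: train on each source dataset, test on every target.
results = {}
for src, train_set in datasets.items():
    model = train(train_set)
    for tgt, test_set in datasets.items():
        results[(src, tgt)] = evaluate(model, test_set)

# Union setting: train on all datasets combined, test on each one.
union_model = train(list(chain.from_iterable(datasets.values())))
for tgt, test_set in datasets.items():
    results[("Union", tgt)] = evaluate(union_model, test_set)
```

The off-diagonal cells of `results` are what the in-dataset setting never measures: a model that scores perfectly on its own test split can still fail on a differently distributed dataset.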
