Dataset for a Neural Natural Language Interface for Databases (NNLIDB)

2017-07-11 · IJCNLP 2017

Florin Brad, Radu Iacob, Ionel Hosu, Traian Rebedea

Code

github.com/johnthebrave/nlidb-datasets

Abstract

Progress in natural language interfaces to databases (NLIDB) has been slow mainly due to linguistic issues (such as language ambiguity) and domain portability. Moreover, the lack of a large corpus to be used as a standard benchmark has made data-driven approaches difficult to develop and compare. In this paper, we revisit the problem of NLIDBs and recast it as a sequence translation problem. To this end, we introduce a large dataset extracted from the Stack Exchange Data Explorer website, which can be used for training neural natural language interfaces for databases. We also report encouraging baseline results on a smaller manually annotated test corpus, obtained using an attention-based sequence-to-sequence neural network.
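As a rough illustration of what recasting an NLIDB as sequence translation can look like, the sketch below turns a question/SQL pair into source and target token-ID sequences of the kind a sequence-to-sequence model consumes. The example pair, the tokenizer, the special tokens, and all function names are assumptions made for illustration; they are not taken from the paper or its dataset.

```python
import re

# Hypothetical example pair; not taken from the dataset.
PAIRS = [
    ("how many users are there", "SELECT COUNT(*) FROM Users"),
]

# Assumed special tokens for sequence start, sequence end, and unknown words.
SOS, EOS, UNK = "<sos>", "<eos>", "<unk>"


def tokenize(text):
    """Split text into word-like tokens and single punctuation characters."""
    return re.findall(r"\w+|[^\w\s]", text)


def build_vocab(sequences):
    """Assign an integer ID to every token, reserving IDs 0-2 for special tokens."""
    vocab = {SOS: 0, EOS: 1, UNK: 2}
    for seq in sequences:
        for tok in seq:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens, vocab):
    """Map tokens to IDs and wrap the sequence in start/end markers."""
    return [vocab[SOS]] + [vocab.get(t, vocab[UNK]) for t in tokens] + [vocab[EOS]]


# Source side: natural language questions. Target side: SQL queries.
src_tokens = [tokenize(question.lower()) for question, _ in PAIRS]
tgt_tokens = [tokenize(sql) for _, sql in PAIRS]

# Separate vocabularies, as is common when source and target languages differ.
src_vocab = build_vocab(src_tokens)
tgt_vocab = build_vocab(tgt_tokens)

examples = [
    (encode(s, src_vocab), encode(t, tgt_vocab))
    for s, t in zip(src_tokens, tgt_tokens)
]
```

Each element of `examples` is a (source IDs, target IDs) pair; an encoder-decoder model would read the first sequence and be trained to emit the second.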

Tasks

Translation
