Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning

2017-08-31 · ICLR 2018 · Code Available

Victor Zhong, Caiming Xiong, Richard Socher


Code

github.com/salesforce/WikiSQL
Official · In paper · none · ★ 1,802
github.com/kasnerz/tabgenie
none★ 58
github.com/ist-daslab/rosa
pytorch★ 44
github.com/Baidi96/text2sql
pytorch★ 0
github.com/abhishekchugh17/sql12
pytorch★ 0
github.com/tiwarikajal/Seq2SQL-
pytorch★ 0
github.com/xiaojunxu/SQLNet
pytorch★ 0
github.com/llSourcell/SQL_Database_Optimization
pytorch★ 0
github.com/racheljose21/chatbot
pytorch★ 0
github.com/PriyankaDatar/NLP_SQL_Project
pytorch★ 0

Abstract

A significant amount of the world's knowledge is stored in relational databases. However, the ability for users to retrieve facts from a database is limited due to a lack of understanding of query languages such as SQL. We propose Seq2SQL, a deep neural network for translating natural language questions to corresponding SQL queries. Our model leverages the structure of SQL queries to significantly reduce the output space of generated queries. Moreover, we use rewards from in-the-loop query execution over the database to learn a policy to generate unordered parts of the query, which we show are less suitable for optimization via cross entropy loss. In addition, we will publish WikiSQL, a dataset of 80654 hand-annotated examples of questions and SQL queries distributed across 24241 tables from Wikipedia. This dataset is required to train our model and is an order of magnitude larger than comparable datasets. By applying policy-based reinforcement learning with a query execution environment to WikiSQL, our model Seq2SQL outperforms attentional sequence to sequence models, improving execution accuracy from 35.9% to 59.4% and logical form accuracy from 23.4% to 48.3%.
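The point about unordered query parts can be made concrete with a small sketch. Two WHERE clauses that list the same conditions in a different order are different token sequences, so a cross entropy loss against a single reference penalizes one of them, yet both execute to the same result. The reward function below is a hypothetical illustration of an execution-based reward: the table, the queries, the helper names `run_query` and `execution_reward`, and the numeric reward values are all assumptions made for this example and are not taken from the abstract.

```python
import sqlite3


def run_query(conn, sql):
    """Execute sql and return its rows as a sorted list, or None if it is invalid."""
    try:
        return sorted(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_reward(conn, predicted_sql, gold_sql):
    """Score a predicted query by executing it (reward values are assumed)."""
    predicted_rows = run_query(conn, predicted_sql)
    if predicted_rows is None:
        return -2.0  # the query could not be executed at all
    if predicted_rows != run_query(conn, gold_sql):
        return -1.0  # the query ran but returned the wrong result
    return 1.0  # the query ran and matched the reference result


# Toy in-memory table, invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("Ann", "Red", 30), ("Bob", "Red", 25), ("Cy", "Blue", 30)],
)

gold = "SELECT name FROM players WHERE team = 'Red' AND age = 30"
reordered = "SELECT name FROM players WHERE age = 30 AND team = 'Red'"
wrong = "SELECT name FROM players WHERE team = 'Blue'"
invalid = "SELECT name FROM players WHERE"

for label, query in [("reordered", reordered), ("wrong", wrong), ("invalid", invalid)]:
    print(label, execution_reward(conn, query, gold))
```

In a policy-gradient setup, a scalar of this kind would weight the log-probability of the sampled query, so the reordered query is rewarded even though it differs token by token from the reference.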

Tasks

Reinforcement Learning (RL), Text-To-SQL

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
WikiSQL	Seq2SQL (Zhong et al., 2017)	Execution Accuracy	59.4	—	Unverified
WikiSQL	Seq2Seq (Zhong et al., 2017)	Execution Accuracy	35.9	—	Unverified
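As a rough guide to reading the two metrics named in the abstract, the sketch below computes both over a handful of made-up predictions. It assumes that execution accuracy counts a prediction as correct when its result matches the reference result, and that logical form accuracy requires the predicted query string to match the reference query. The helper names, the whitespace normalization, and the example records are hypothetical, and the official evaluation script may differ in its details.

```python
def normalize(sql):
    """Collapse runs of whitespace so trivial formatting differences are ignored."""
    return " ".join(sql.split())


def accuracies(examples):
    """Return (execution accuracy, logical form accuracy) as fractions of examples."""
    total = len(examples)
    execution_hits = sum(e["pred_result"] == e["gold_result"] for e in examples)
    logical_form_hits = sum(
        normalize(e["pred_sql"]) == normalize(e["gold_sql"]) for e in examples
    )
    return execution_hits / total, logical_form_hits / total


# Invented records: each holds a predicted query, a reference query, and both results.
examples = [
    {"pred_sql": "SELECT a FROM t WHERE b = 1", "gold_sql": "SELECT a FROM t WHERE b = 1",
     "pred_result": [("x",)], "gold_result": [("x",)]},
    {"pred_sql": "SELECT a FROM t WHERE c = 2 AND b = 1", "gold_sql": "SELECT a FROM t WHERE b = 1 AND c = 2",
     "pred_result": [("y",)], "gold_result": [("y",)]},
    {"pred_sql": "SELECT a FROM t WHERE b = 3", "gold_sql": "SELECT a FROM t WHERE b = 4",
     "pred_result": [("z",)], "gold_result": [("w",)]},
    {"pred_sql": "SELECT  a FROM t", "gold_sql": "SELECT a FROM t",
     "pred_result": [("x",), ("y",)], "gold_result": [("x",), ("y",)]},
]

execution_accuracy, logical_form_accuracy = accuracies(examples)
print(execution_accuracy, logical_form_accuracy)
```

The second record shows why the two numbers can diverge: reordering the conditions changes the query string but not the result, so it counts toward execution accuracy only.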
