GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing

2020-09-29 · ICLR 2021 · Code Available

Tao Yu, Chien-Sheng Wu, Xi Victoria Lin, Bailin Wang, Yi Chern Tan, Xinyi Yang, Dragomir Radev, Richard Socher, Caiming Xiong

Code

github.com/taoyds/grappa
PyTorch · ★ 31

Abstract

We present GraPPa, an effective pre-training approach for table semantic parsing that learns a compositional inductive bias in the joint representations of textual and tabular data. We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar (SCFG) induced from existing text-to-SQL datasets. We pre-train our model on the synthetic data using a novel text-schema linking objective that predicts the syntactic role of a table field in the SQL for each question-SQL pair. To maintain the model's ability to represent real-world data, we also include masked language modeling (MLM) over several existing table-and-language datasets to regularize the pre-training process. On four popular fully supervised and weakly supervised table semantic parsing benchmarks, GraPPa significantly outperforms RoBERTa-large as the feature representation layers and establishes new state-of-the-art results on all of them.
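As a rough illustration of the data construction described in the abstract, the sketch below samples a synthetic question-SQL pair from a tiny hand-written synchronous grammar and derives a per-column role label of the kind a text-schema linking objective could predict. It is a minimal sketch under stated assumptions, not the authors' implementation: the table, columns, rule templates, role names, and the function `sample_pair` are all hypothetical, and the paper induces its grammar from existing text-to-SQL datasets rather than writing rules by hand.

```python
import random

# Hypothetical toy schema; table and column names are illustrative only.
TABLE = "students"
COLUMNS = ["name", "age", "major"]

# Hand-written stand-ins for synchronous rules: a question template and a
# SQL template that share the same slots, plus the SQL clause ("role")
# each column slot fills. The role names are an assumption of this sketch.
RULES = [
    {
        "question": "show the {col1} of {table} where {col2} is {value}",
        "sql": "SELECT {col1} FROM {table} WHERE {col2} = {value}",
        "roles": {"col1": "SELECT", "col2": "WHERE"},
    },
    {
        "question": "list the {col1} of {table} ordered by {col2}",
        "sql": "SELECT {col1} FROM {table} ORDER BY {col2}",
        "roles": {"col1": "SELECT", "col2": "ORDER BY"},
    },
]


def sample_pair(rng):
    """Sample one (question, sql, labels) triple from the toy grammar."""
    rule = rng.choice(RULES)
    # Bind the two column slots to distinct columns of the table.
    col1, col2 = rng.sample(COLUMNS, 2)
    binding = {"table": TABLE, "col1": col1, "col2": col2, "value": "20"}
    # The same binding fills both sides, keeping question and SQL aligned.
    question = rule["question"].format(**binding)
    sql = rule["sql"].format(**binding)
    # Every column gets a label; columns absent from the SQL get "NONE".
    labels = {column: "NONE" for column in COLUMNS}
    for slot, role in rule["roles"].items():
        labels[binding[slot]] = role
    return question, sql, labels


if __name__ == "__main__":
    rng = random.Random(0)
    question, sql, labels = sample_pair(rng)
    print(question)
    print(sql)
    print(labels)
```

In this toy setup, the labels dictionary plays the role of the supervision signal: a model reading the question together with the column names would be trained to predict each column's label, while MLM on real table-and-language data would be applied separately.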

Tasks

Inductive Bias, Language Modeling, Masked Language Modeling, Semantic Parsing, Text-to-SQL

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
Spider	RAT-SQL + Grammar-Augmented Pre-Training	Accuracy	69.6	—	Unverified
