TNT-NLG, System 1: Using a statistical NLG to massively augment crowd-sourced data for neural generation

2018-04-26 · E2E NLG Challenge System Descriptions 2018

Shereen Oraby, Lena Reed, Shubhangi Tandon, Stephanie Lukin, Marilyn A. Walker

Abstract

Ever since the successful application of sequence-to-sequence learning to neural machine translation systems (Sutskever et al., 2014), interest has surged in its applicability to language generation in other problem domains. In the area of natural language generation (NLG), there has been a great deal of interest in end-to-end (E2E) neural models that learn and generate natural language sentence realizations in one step. In this paper, we present TNT-NLG System 1, our first system submission to the E2E NLG Challenge, where we generate natural language (NL) realizations from meaning representations (MRs) in the restaurant domain by massively expanding the training dataset. We develop two models for this system, based on Dusek et al.'s (2016a) open-source baseline model and context-aware neural language generator. Starting with the MR and NL pairs from the E2E generation challenge dataset, we explode the size of the training set using PERSONAGE (Mairesse and Walker, 2010), a statistical generator able to produce varied realizations from MRs, and use our expanded data as contextual input to our models. We present evaluation results using automated and human evaluation metrics, and describe directions for future work.
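The core idea sketched in the abstract — pairing each MR with many statistically generated surface realizations to multiply the seq2seq training data — can be illustrated with a minimal, hypothetical sketch. The template generator below is a trivial stand-in for PERSONAGE, and all names and MR formats are illustrative assumptions, not the paper's actual pipeline:

```python
# Minimal sketch of the augmentation idea: for each meaning representation
# (MR), a statistical generator (here, a trivial template stand-in for
# PERSONAGE) emits several varied realizations, exploding the number of
# (MR, NL) pairs available to train the neural generator.
# All function names and templates here are hypothetical illustrations.

def parse_mr(mr):
    """Parse an E2E-style MR string such as 'name[Aromi], food[Chinese]'."""
    slots = {}
    for part in mr.split(", "):
        key, _, value = part.partition("[")
        slots[key] = value.rstrip("]")
    return slots

# Hand-written templates standing in for PERSONAGE's varied outputs.
TEMPLATES = [
    "{name} serves {food} food.",
    "{name} is a restaurant offering {food} cuisine.",
    "If you want {food} food, try {name}.",
]

def augment(pairs):
    """Explode (MR, NL) pairs by adding template-generated variants."""
    out = []
    for mr, nl in pairs:
        out.append((mr, nl))  # keep the original crowd-sourced reference
        slots = parse_mr(mr)
        for template in TEMPLATES:
            out.append((mr, template.format(**slots)))
    return out

seed = [("name[Aromi], food[Chinese]", "Aromi serves Chinese food.")]
expanded = augment(seed)
print(len(expanded))  # 1 original pair + 3 generated variants = 4
```

In the actual system, PERSONAGE produces far more varied realizations than fixed templates, and the expanded corpus is then fed to the seq2seq models rather than used directly.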
