Handling Rare Items in Data-to-Text Generation

2018-11-01 · WS 2018

Anastasia Shimorina, Claire Gardent


Code

gitlab.com/shimorina/inlg-2018
Official · In paper
gitlab.com/shimorina/webnlg-dataset
Official · In paper · tf

Abstract

Neural approaches to data-to-text generation generally handle rare input items using either delexicalisation or a copy mechanism. We investigate the relative impact of these two methods on two datasets (E2E and WebNLG) under two evaluation settings. We show (i) that rare items strongly impact performance; (ii) that combining delexicalisation and copying yields the strongest improvement; (iii) that copying underperforms for rare and unseen items; and (iv) that the impact of these two mechanisms varies greatly depending on how the dataset is constructed and on how it is split into train, dev and test.
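To make the first of the two methods concrete, here is a minimal sketch of delexicalisation, assuming exact string matching between slot values and the reference text. It is not the authors' implementation: the function names, the `__SLOT__` placeholder format, and the example meaning representation are all hypothetical and chosen only for illustration.

```python
def delexicalise(mr, text):
    """Replace each slot value found in `text` with a placeholder.

    `mr` is a hypothetical meaning representation: a dict of slot -> value.
    Returns the delexicalised text and the placeholder -> value mapping.
    """
    mapping = {}
    for slot, value in mr.items():
        placeholder = "__" + slot.upper() + "__"  # assumed placeholder format
        if value in text:  # exact match only; real systems often need fuzzier matching
            text = text.replace(value, placeholder)
            mapping[placeholder] = value
    return text, mapping


def relexicalise(text, mapping):
    """Put the original values back in place of the placeholders."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text


# Illustrative input, not taken from either dataset.
mr = {"name": "Zizzi", "area": "riverside"}
reference = "Zizzi is a restaurant in the riverside area."

delex_text, mapping = delexicalise(mr, reference)
restored = relexicalise(delex_text, mapping)
```

With placeholders in place of rare values such as names, a model only has to learn the sentence pattern; the values are restored after generation. A copy mechanism instead keeps the values in the input and lets the decoder copy tokens from it.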

Tasks

Data-to-Text Generation, KG-to-Text Generation, Text Generation
