
Data-to-text Generation by Splicing Together Nearest Neighbors

2021-01-20 · EMNLP 2021 · Code Available

Sam Wiseman, Arturs Backurs, Karl Stratos

Abstract

We propose to tackle data-to-text generation tasks by directly splicing together retrieved segments of text from "neighbor" source-target pairs. Unlike recent work that conditions on retrieved neighbors but generates text token-by-token, left-to-right, we learn a policy that directly manipulates segments of neighbor text, by inserting or replacing them in partially constructed generations. Standard techniques for training such a policy require an oracle derivation for each generation, and we prove that finding the shortest such derivation can be reduced to parsing under a particular weighted context-free grammar. We find that policies learned in this way perform on par with strong baselines in terms of automatic and human evaluation, but allow for more interpretable and controllable generation.
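As a rough illustration of the insert and replace operations the abstract describes, the sketch below applies a hand-written sequence of splice actions to an initially empty generation. It is not the authors' implementation: the names `Splice`, `apply_splice`, and `run_derivation` are hypothetical, the example sentences are invented, and the fixed action list stands in for what a learned policy would choose.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Splice:
    """Hypothetical action: copy neighbors[neighbor][src_start:src_end]
    over canvas[dst_start:dst_end]. When dst_start == dst_end the action
    is a pure insertion; otherwise it replaces the covered span."""
    neighbor: int
    src_start: int
    src_end: int
    dst_start: int
    dst_end: int


def apply_splice(canvas: List[str], neighbors: Sequence[List[str]],
                 action: Splice) -> List[str]:
    # Take a contiguous segment of text from one retrieved neighbor.
    segment = neighbors[action.neighbor][action.src_start:action.src_end]
    # Splice it into the partially constructed generation.
    return canvas[:action.dst_start] + segment + canvas[action.dst_end:]


def run_derivation(neighbors: Sequence[List[str]],
                   actions: Sequence[Splice]) -> List[str]:
    canvas: List[str] = []  # start from an empty generation
    for action in actions:
        canvas = apply_splice(canvas, neighbors, action)
    return canvas


# Invented neighbor targets, tokenized on whitespace.
neighbors = [
    "the Eagle is a cheap pub near the river".split(),
    "Aromi is a family friendly restaurant in the city centre".split(),
]

# A hand-written derivation (an assumption; a trained policy would pick these).
derivation = [
    Splice(neighbor=0, src_start=0, src_end=9, dst_start=0, dst_end=0),  # insert all of neighbor 0
    Splice(neighbor=1, src_start=3, src_end=6, dst_start=4, dst_end=6),  # replace "cheap pub"
]

result = run_derivation(neighbors, derivation)
print(" ".join(result))
# the Eagle is a family friendly restaurant near the river
```

Because every token in the output comes from a recorded action, each span can be traced back to the neighbor it was copied from, which is one way to read the interpretability claim above.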
