Neural Machine Translation of Text from Non-Native Speakers

2018-08-19 · NAACL 2019

Antonios Anastasopoulos, Alison Lui, Toan Nguyen, David Chiang

Code

bitbucket.org/antonis/nmt-grammar-noise (official, in paper)
github.com/tnq177/nmt_text_from_non_native_speaker (PyTorch)

Abstract

Neural Machine Translation (NMT) systems are known to degrade when confronted with noisy data, especially when the system is trained only on clean data. In this paper, we show that augmenting training data with sentences containing artificially-introduced grammatical errors can make the system more robust to such errors. In combination with an automatic grammar error correction system, we can recover 1.5 BLEU out of 2.4 BLEU lost due to grammatical errors. We also present a set of Spanish translations of the JFLEG grammar error correction corpus, which allows for testing NMT robustness to real grammatical errors.
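
As a rough illustration of the augmentation idea described in the abstract, the following minimal sketch injects artificial errors into the source side of a parallel corpus while leaving the target side untouched. The error types (dropped articles, swapped prepositions), the word lists, the probability `p`, and the function names `add_grammar_noise` and `augment` are all assumptions made for this example; they are not taken from the paper or its released code.

```python
import random

# Illustrative word lists (assumptions, not the paper's error inventory).
ARTICLES = {"a", "an", "the"}
PREPOSITIONS = ["in", "on", "at", "for", "to", "of", "with"]


def add_grammar_noise(tokens, rng, p=0.3):
    """Return a copy of tokens with simple article/preposition errors.

    Each article is dropped with probability p, and each preposition is
    replaced by a different preposition with probability p.
    """
    noisy = []
    for tok in tokens:
        low = tok.lower()
        if low in ARTICLES and rng.random() < p:
            continue  # simulate a missing-article error
        if low in PREPOSITIONS and rng.random() < p:
            choices = [w for w in PREPOSITIONS if w != low]
            noisy.append(rng.choice(choices))  # simulate a wrong preposition
            continue
        noisy.append(tok)
    return noisy


def augment(pairs, rng, p=0.3):
    """Keep every clean (source, target) pair and add a noisy-source copy.

    The target sentence is left unchanged, so the model sees an erroneous
    source paired with the correct translation.
    """
    out = []
    for src, tgt in pairs:
        out.append((src, tgt))
        noisy_src = " ".join(add_grammar_noise(src.split(), rng, p))
        if noisy_src != src:
            out.append((noisy_src, tgt))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    corpus = [("the cat sat on the mat", "el gato se sentó en la alfombra")]
    for src, tgt in augment(corpus, rng, p=0.5):
        print(src, "|||", tgt)
```

Keeping the clean pairs alongside the noisy copies is a design choice of this sketch, intended to avoid degrading translation of well-formed input; the paper's actual mixing strategy may differ.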

Tasks

Machine Translation, NMT, Translation
