Data Diversification: A Simple Strategy For Neural Machine Translation

2019-11-05NeurIPS 2020Code Available1· sign in to hype

Xuan-Phi Nguyen, Shafiq Joty, Wu Kui, Ai Ti Aw

Code Available — Be the first to reproduce this paper.

Code

github.com/nxphi47/data_diversification
OfficialIn paperpytorch★ 24
github.com/nxphi47/multiagent_crosstranslate
pytorch★ 5

Abstract

We introduce Data Diversification: a simple but effective strategy to boost neural machine translation (NMT) performance. It diversifies the training data by using the predictions of multiple forward and backward models and then merging them with the original dataset on which the final NMT model is trained. Our method is applicable to all NMT models. It does not require extra monolingual data like back-translation, nor does it add more computations and parameters like ensembles of models. Our method achieves state-of-the-art BLEU scores of 30.7 and 43.7 in the WMT'14 English-German and English-French translation tasks, respectively. It also substantially improves on 8 other translation tasks: 4 IWSLT tasks (English-German and English-French) and 4 low-resource translation tasks (English-Nepali and English-Sinhala). We demonstrate that our method is more effective than knowledge distillation and dual learning, it exhibits strong correlation with ensembles of models, and it trades perplexity off for better BLEU score. We have released our source code at https://github.com/nxphi47/data_diversification

Tasks

Knowledge Distillation Machine Translation NMT Translation

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
IWSLT2014 German-English	Data Diversification	BLEU score	37.2	—	Unverified
WMT2014 English-German	Data Diversification - Transformer	BLEU score	30.7	—	Unverified

Data Diversification: A Simple Strategy For Neural Machine Translation

Code

Abstract

Tasks

Benchmark Results

Reproductions