The University of Sydney's Machine Translation System for WMT19

2019-06-30 · WS 2019

Liang Ding, DaCheng Tao


Abstract

This paper describes the University of Sydney's submission to the WMT 2019 shared news translation task. We participated in the Finnish→English direction and achieved the best BLEU score (33.0) among all participants. Our system is based on self-attentional Transformer networks, into which we integrated the most recent effective strategies from academic research (e.g., BPE, back translation, multi-features data selection, data augmentation, greedy model ensemble, reranking, ConMBR system combination, and post-processing). Furthermore, we propose a novel augmentation method, Cycle Translation, and a data mixture strategy, Big/Small parallel construction, to fully exploit the synthetic corpus. Extensive experiments show that adding the above techniques yields steady improvements in BLEU, and the best result outperforms the baseline (a Transformer ensemble model trained with the original parallel corpus) by approximately 5.3 BLEU points, achieving state-of-the-art performance.
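The abstract names "greedy model ensemble" among the integrated strategies but does not spell out the procedure. As a rough illustration only, the sketch below shows a generic forward-selection ensemble: starting from an empty set, it repeatedly adds the candidate model that most improves a development-set score and stops when no candidate helps. The function names, the `toy_score` lookup, and all scores in it are hypothetical values made up for the example; they are not taken from the paper, and the authors' actual selection procedure may differ.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple


def greedy_ensemble(
    candidates: Sequence[str],
    score: Callable[[Sequence[str]], float],
) -> Tuple[List[str], float]:
    """Forward selection: add one model at a time while the dev score improves."""
    selected: List[str] = []
    best = float("-inf")
    remaining = list(candidates)
    while remaining:
        # Score the current ensemble extended by each remaining candidate.
        trial_scores = {m: score(selected + [m]) for m in remaining}
        winner = max(trial_scores, key=trial_scores.get)
        if trial_scores[winner] <= best:
            break  # No candidate improves the ensemble; stop.
        selected.append(winner)
        best = trial_scores[winner]
        remaining.remove(winner)
    return selected, best


# Hypothetical dev-set BLEU for each subset of three toy checkpoints.
# These numbers are invented for illustration, not results from the paper.
TOY_DEV_BLEU: Dict[FrozenSet[str], float] = {
    frozenset({"a"}): 30.0,
    frozenset({"b"}): 29.5,
    frozenset({"c"}): 28.0,
    frozenset({"a", "b"}): 30.8,
    frozenset({"a", "c"}): 30.2,
    frozenset({"b", "c"}): 29.9,
    frozenset({"a", "b", "c"}): 30.6,
}


def toy_score(models: Sequence[str]) -> float:
    """Stand-in for decoding the dev set with an ensemble and computing BLEU."""
    return TOY_DEV_BLEU[frozenset(models)]


if __name__ == "__main__":
    print(greedy_ensemble(["a", "b", "c"], toy_score))
```

In a real system, `score` would decode the development set with the given ensemble and compute BLEU, which is expensive; the greedy search keeps the number of such evaluations quadratic in the number of candidates rather than exponential.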

Tasks

Data Augmentation, Machine Translation, Reranking, Translation

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
WMT2016 Finnish-English	CT+B/S construction	BLEU	32.4	—	Unverified
WMT2017 Finnish-English	CT+B/S construction	BLEU	35.5	—	Unverified
WMT2018 Finnish-English	CT+B/S construction	BLEU	26.5	—	Unverified
WMT2019 Finnish-English	CT+B/S construction	BLEU	34.1	—	Unverified
