Neural Machine Translation for English--Kazakh with Morphological Segmentation and Synthetic Data

2019-08-01 · WS 2019

Antonio Toral, Lukas Edman, Galiya Yeshmagambetova, Jennifer Spenader

Abstract

This paper presents the systems submitted by the University of Groningen to the English--Kazakh language pair (both translation directions) for the WMT 2019 news translation task. We explore the potential benefits of (i) morphological segmentation (both unsupervised and rule-based), given the agglutinative nature of Kazakh, (ii) data from two additional languages (Turkish and Russian), given the scarcity of English--Kazakh data, and (iii) synthetic data, both for the source and for the target language. Our best submissions ranked second for Kazakh→English and third for English→Kazakh in terms of the BLEU automatic evaluation metric.
