Data-adaptive Transfer Learning for Low-resource Translation: A Case Study in Haitian

2021-11-16 · ACL ARR November 2021

Anonymous

Abstract

Multilingual transfer techniques often improve low-resource machine translation (MT). Many of these techniques are applied without considering data characteristics. We show in the context of Haitian-to-English translation that transfer effectiveness is correlated with the amount of training data and the relationships between knowledge-sharing languages. Our experiments suggest that beyond a threshold of authentic data, back-translation augmentation methods are counterproductive, while cross-lingual transfer during training is preferred. We complement this finding by contributing a rule-based French-Haitian orthographic and syntactic engine and a novel method for phonological embedding. When used with multilingual techniques, orthographic transformation significantly improves performance over conventional methods, and phonological transfer greatly improves performance in Jamaican MT.
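
To make the idea of a rule-based orthographic transformation concrete, the sketch below applies an ordered list of regex rewrite rules to French spellings. The abstract does not describe the engine's rules, so everything here is an assumption made for illustration: the names `TOY_RULES` and `transform` are hypothetical, and the three rules are toy examples, not the rule set contributed by the paper.

```python
import re

# Hypothetical, illustrative rules only; NOT the rule set from the paper.
# Each pair is (regex over French spelling, replacement).
# Order matters: longer graphemes are rewritten before shorter ones.
TOY_RULES = [
    (r"eau", "o"),         # "eau" -> "o"
    (r"qu", "k"),          # "qu" -> "k"
    (r"c(?=[aou])", "k"),  # hard "c" -> "k"; "ch" is left untouched
]


def transform(word: str) -> str:
    """Apply each rewrite rule in order to a lowercased French word."""
    out = word.lower()
    for pattern, replacement in TOY_RULES:
        out = re.sub(pattern, replacement, out)
    return out


if __name__ == "__main__":
    for french in ["chapeau", "bateau", "couteau", "qui"]:
        print(french, "->", transform(french))
```

On these four inputs the toy rules yield `chapo`, `bato`, `kouto`, and `ki`, which are attested Haitian spellings. A real engine would need far broader coverage and, per the abstract, syntactic rules as well.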
