Ensemble Self-Training for Low-Resource Languages: Grapheme-to-Phoneme Conversion and Morphological Inflection

2020-07-01 · WS 2020

Xiang Yu, Ngoc Thang Vu, Jonas Kuhn

Abstract

We present an iterative data augmentation framework that trains and searches for an optimal ensemble while simultaneously annotating new training data in a self-training style. We apply this framework to two SIGMORPHON 2020 shared tasks: grapheme-to-phoneme conversion and morphological inflection. With very simple base models in the ensemble, we rank first and fourth in these two tasks, respectively. Our analysis shows that the system works especially well on low-resource languages.
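
The loop described in the abstract (train base models, search for a good ensemble, label new data with it, repeat) can be illustrated with a generic ensemble self-training sketch. This is not the authors' system: the abstract does not specify the base models, the ensemble search, or how new data is selected, so everything below is an assumption made for illustration. In particular, the toy character-substitution model, the greedy forward selection, majority voting over whole output strings, and all function and parameter names (`train_base_model`, `search_ensemble`, `ensemble_self_training`, `rounds`, `n_models`, `batch`) are hypothetical.

```python
import random
from collections import Counter


def train_base_model(pairs, seed):
    """Toy base model (assumption): a per-character substitution table
    learned from a random ~80% subsample of equal-length (source, target) pairs."""
    rng = random.Random(seed)
    sample = [p for p in pairs if rng.random() < 0.8] or list(pairs)
    votes = {}
    for src, tgt in sample:
        if len(src) != len(tgt):
            continue  # the toy model only handles one-to-one alignments
        for s, t in zip(src, tgt):
            votes.setdefault(s, Counter())[t] += 1
    table = {s: c.most_common(1)[0][0] for s, c in votes.items()}
    # Unknown characters are copied through unchanged.
    return lambda word: "".join(table.get(ch, ch) for ch in word)


def ensemble_predict(models, word):
    """Majority vote over the complete output strings of all models."""
    return Counter(m(word) for m in models).most_common(1)[0][0]


def accuracy(models, dev):
    """Exact-match accuracy of the ensemble on labeled dev pairs."""
    return sum(ensemble_predict(models, s) == t for s, t in dev) / len(dev)


def search_ensemble(models, dev):
    """Greedy forward selection (assumption): keep adding the model that
    gives the best dev accuracy until accuracy stops improving."""
    chosen, best, remaining = [], -1.0, list(models)
    while remaining:
        score, cand = max(
            ((accuracy(chosen + [m], dev), m) for m in remaining),
            key=lambda pair: pair[0],
        )
        if score <= best:
            break
        chosen.append(cand)
        remaining.remove(cand)
        best = score
    return chosen, best


def ensemble_self_training(train, dev, unlabeled, rounds=3, n_models=5, batch=2):
    """Each round: retrain base models, search an ensemble, then label a
    batch of unlabeled inputs with it and add them to the training data."""
    data, pool = list(train), list(unlabeled)
    ensemble, score = [], 0.0
    for r in range(rounds):
        models = [train_base_model(data, seed=r * n_models + i) for i in range(n_models)]
        ensemble, score = search_ensemble(models, dev)
        new, pool = pool[:batch], pool[batch:]
        data += [(w, ensemble_predict(ensemble, w)) for w in new]
    return ensemble, score, data
```

Calling `ensemble_self_training(train, dev, unlabeled)` with a few labeled pairs, a small dev set, and a list of unlabeled source strings returns the final ensemble, its dev accuracy, and the augmented training set. A real system would replace the toy substitution model with sequence-to-sequence models and would likely filter pseudo-labels by confidence, which this sketch omits.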
