Building a Better Bitext for Structurally Different Languages through Self-training
2017-11-01 · WS 2017
Jungyeul Park, Loïc Dugast, Jeen-Pyo Hong, Chang-Uk Shin, Jeong-Won Cha
Abstract
We propose a novel method to bootstrap the construction of parallel corpora for new pairs of structurally different languages by combining a pivot language with self-training. The pivot language lets existing translation models bootstrap the alignment, and the self-training procedure then improves that alignment at both the document and sentence levels. We also propose several evaluation methods for the resulting alignment.
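
The abstract gives no algorithmic detail, so the following is only a toy sketch of the general idea of pivot-based alignment refined by self-training, not the authors' procedure. Everything in it is an assumption made for illustration: the word-level `lexicon` stands in for an existing translation model into the pivot language, the target side is assumed to be already expressed in pivot-language tokens, Jaccard overlap is the similarity score, and the names `translate`, `align`, `self_train`, and `threshold` are hypothetical.

```python
def translate(tokens, lexicon):
    """Map source tokens to pivot tokens; tokens missing from the lexicon are dropped."""
    return {lexicon[t] for t in tokens if t in lexicon}


def jaccard(a, b):
    """Jaccard overlap between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def align(src_sents, tgt_sents, lexicon):
    """For each source sentence, pick the best-scoring target sentence.

    Returns a list of (source_index, target_index, score) triples.
    Ties are resolved in favour of the lowest target index.
    """
    pairs = []
    for i, sent in enumerate(src_sents):
        pivot = translate(sent.split(), lexicon)
        scores = [jaccard(pivot, set(t.split())) for t in tgt_sents]
        j = max(range(len(tgt_sents)), key=scores.__getitem__)
        pairs.append((i, j, scores[j]))
    return pairs


def self_train(src_sents, tgt_sents, lexicon, threshold=0.5, rounds=3):
    """Alternate between aligning and growing the lexicon from confident pairs.

    Assumed (toy) learning rule: when a confident pair leaves exactly one
    unknown source token and exactly one unexplained target token, link them.
    """
    lexicon = dict(lexicon)  # work on a copy; the caller's lexicon is untouched
    for _ in range(rounds):
        grew = False
        for i, j, score in align(src_sents, tgt_sents, lexicon):
            if score < threshold:
                continue  # not confident enough to learn from
            src_tokens = src_sents[i].split()
            unknown = [t for t in src_tokens if t not in lexicon]
            covered = {lexicon[t] for t in src_tokens if t in lexicon}
            free = [t for t in tgt_sents[j].split() if t not in covered]
            if len(unknown) == 1 and len(free) == 1:
                lexicon[unknown[0]] = free[0]
                grew = True
        if not grew:
            break  # nothing new was learned, so another round would change nothing
    return align(src_sents, tgt_sents, lexicon), lexicon
```

With the seed lexicon `{"s1": "t1", "s2": "t2"}`, source sentences `["s1 s2 s3", "s3 s4"]`, and target sentences `["t1 t2 t3", "t3 t4"]`, the second source sentence initially has no pivot tokens and cannot be aligned meaningfully; after the loop learns `s3` and then `s4` from confident pairs, both sentences align to their counterparts. A real system would replace the lexicon with a translation model and the single-token rule with retraining on the confidently aligned pairs.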