Building a Better Bitext for Structurally Different Languages through Self-training

2017-11-01 · WS 2017

Jungyeul Park, Loïc Dugast, Jeen-Pyo Hong, Chang-Uk Shin, Jeong-Won Cha

Abstract

We propose a novel method to bootstrap the construction of parallel corpora for new pairs of structurally different languages. We do so by combining the use of a pivot language with self-training. The pivot language lets existing translation models bootstrap the alignment, and the self-training procedure improves that alignment at both the document and sentence levels. We also propose several evaluation methods for the resulting alignment.
