Two Methods for Domain Adaptation of Bilingual Tasks: Delightfully Simple and Broadly Applicable
Viktor Hangya, Fabienne Braune, Alex Fraser, er, Hinrich Sch{\"u}tze
Code Available — Be the first to reproduce this paper.
ReproduceCode
- github.com/hangyav/biadaptOfficialIn papertf★ 0
Abstract
Bilingual tasks, such as bilingual lexicon induction and cross-lingual classification, are crucial for overcoming data sparsity in the target language. Resources required for such tasks are often out-of-domain, thus domain adaptation is an important problem here. We make two contributions. First, we test a delightfully simple method for domain adaptation of bilingual word embeddings. We evaluate these embeddings on two bilingual tasks involving different domains: cross-lingual twitter sentiment classification and medical bilingual lexicon induction. Second, we tailor a broadly applicable semi-supervised classification method from computer vision to these tasks. We show that this method also helps in low-resource setups. Using both methods together we achieve large improvements over our baselines, by using only additional unlabeled data.