Multi-domain Adaptation for Statistical Machine Translation Based on Feature Augmentation

2016-10-01 · AMTA 2016

Kenji Imamura, Eiichiro Sumita


Abstract

Domain adaptation is a major challenge when applying machine translation to practical tasks. In this paper, we present domain adaptation methods for machine translation that assume multiple domains. The proposed methods combine two model types: a corpus-concatenated model covering multiple domains and single-domain models that are accurate but sparse in specific domains. We combine the advantages of both models using feature augmentation for domain adaptation in machine learning. Our experimental results show that the BLEU scores of the proposed method clearly surpass those of single-domain models for low-resource domains. For high-resource domains, the scores of the proposed method were superior to those of both single-domain and corpus-concatenated models. Even in domains having a million bilingual sentences, the translation quality was at least preserved and even improved in some domains. These results demonstrate that state-of-the-art domain adaptation can be realized with appropriate settings, even when using standard log-linear models.
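
As a rough illustration of the feature augmentation idea named in the abstract, the sketch below shows the general machine-learning form of the technique: each feature vector is copied into a shared "common" block and into the block of its own domain, with every other domain block left at zero. The function names (`augment_features`, `log_linear_score`), the block layout, and the toy values are assumptions made for illustration; the abstract does not specify how the authors map the feature functions of their translation models into the augmented space.

```python
from typing import List, Sequence


def augment_features(
    features: Sequence[float], domain: int, num_domains: int
) -> List[float]:
    """Map a feature vector into an augmented space (hypothetical layout).

    Layout assumed here: [common block | domain 0 block | ... | domain D-1 block].
    The common block and the block of `domain` hold a copy of `features`;
    all other domain blocks stay at zero.
    """
    if not 0 <= domain < num_domains:
        raise ValueError("domain index out of range")
    n = len(features)
    augmented = [0.0] * (n * (num_domains + 1))
    augmented[:n] = list(features)                  # shared (common) copy
    start = n * (domain + 1)
    augmented[start:start + n] = list(features)     # domain-specific copy
    return augmented


def log_linear_score(weights: Sequence[float], augmented: Sequence[float]) -> float:
    """Dot product of weights and features, as in a standard log-linear model."""
    if len(weights) != len(augmented):
        raise ValueError("weights and features must have the same length")
    return sum(w * f for w, f in zip(weights, augmented))


# Toy example: two features, two domains, sentence belongs to domain 1.
example = augment_features([1.0, 2.0], domain=1, num_domains=2)
```

With this layout, weights on the common block are shared by every domain, while weights on a domain block only affect sentences from that domain, so a single weight vector can hold both general and domain-specific preferences.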

Tasks

Domain Adaptation, Machine Translation, Translation
