Parameter Sharing Methods for Multilingual Self-Attentional Translation Models

2018-09-01 · WS 2018

Devendra Singh Sachan, Graham Neubig


Code

github.com/DevSinghSachan/multilingual_nmt
Official · In paper · PyTorch

Abstract

In multilingual neural machine translation, it has been shown that sharing a single translation model between multiple languages can achieve competitive performance, sometimes even leading to performance gains over bilingually trained models. However, these improvements are not uniform; multilingual parameter sharing often results in a decrease in accuracy because translation models cannot accommodate different languages in their limited parameter space. In this work, we examine parameter sharing techniques that strike a happy medium between full sharing and individual training, specifically focusing on the self-attentional Transformer model. We find that the full parameter sharing approach leads to increases in BLEU scores mainly when the target languages are from a similar language family. However, even when the target languages are from different families, a case in which full parameter sharing leads to a noticeable drop in BLEU scores, our proposed methods for partial sharing of parameters can lead to substantial improvements in translation accuracy.
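
The spectrum the abstract describes, from individual training through partial sharing to full sharing, can be illustrated with a small sketch. This is not the paper's implementation: the component names, the target language codes `de` and `ja`, and the helper functions are hypothetical, and the abstract does not state which components the authors actually share. Plain Python lists stand in for weight tensors, and "sharing" means that two decoders hold the same object.

```python
# Hypothetical component names for one Transformer decoder layer.
# The abstract does not say which of these the paper shares.
DECODER_COMPONENTS = ("self_attention", "encoder_attention", "feed_forward")


def build_decoders(target_langs, shared_components):
    """Return {lang: {component: parameter_block}} for several target languages.

    Components listed in `shared_components` are one block reused by every
    language; every other component gets a fresh block per language.
    """
    # One block per shared component, created once (placeholder values).
    shared = {name: [0.0] * 4 for name in shared_components}
    decoders = {}
    for lang in target_langs:
        decoders[lang] = {
            name: shared[name] if name in shared else [0.0] * 4
            for name in DECODER_COMPONENTS
        }
    return decoders


def count_unique_blocks(decoders):
    """Count distinct parameter blocks by object identity."""
    return len({id(block) for dec in decoders.values() for block in dec.values()})


langs = ("de", "ja")  # assumed target languages, for illustration only
full = build_decoders(langs, DECODER_COMPONENTS)       # full sharing
separate = build_decoders(langs, ())                   # individual decoders
partial = build_decoders(langs, ("self_attention",))   # one possible middle ground

print(count_unique_blocks(full))      # 3
print(count_unique_blocks(separate))  # 6
print(count_unique_blocks(partial))   # 5
```

With two target languages, full sharing keeps three blocks, separate decoders keep six, and sharing only one component keeps five. Partial sharing therefore gives each language some capacity of its own while still reusing part of the model.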

Tasks

Machine Translation, Translation
