
Improving Low-Resource NMT through Relevance Based Linguistic Features Incorporation

2020-12-01 · COLING 2020

Abhisek Chakrabarty, Raj Dabre, Chenchen Ding, Masao Utiyama, Eiichiro Sumita


Abstract

In this study, linguistic knowledge at different levels is incorporated into the neural machine translation (NMT) framework to improve translation quality for language pairs with extremely limited data. Integrating manually designed or automatically extracted features into the NMT framework is known to be beneficial. However, this study emphasizes that the relevance of the features is crucial to performance. Specifically, we propose two methods, 1) self-relevance and 2) word-based relevance, to improve the representation of features for NMT. Experiments are conducted on translation tasks from English to eight Asian languages, each with no more than twenty thousand training sentences. The proposed methods improve translation quality on all tasks, by up to 3.09 BLEU points. Discussions with visualization provide explainability for the proposed methods: the relevance methods assign weights to features, thereby enhancing their impact on low-resource machine translation.
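The abstract does not spell out the exact formulation, but the core idea of weighting linguistic features by relevance before folding them into the word representation can be illustrated with a minimal sketch. The function name, the sigmoid gate, and the additive combination below are assumptions for illustration, not the paper's actual method:

```python
import math

def sigmoid(x):
    # Squash a raw relevance score into a weight in (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def relevance_weighted_embedding(word_emb, feature_embs, relevance_scores):
    """Combine a word embedding with linguistic feature embeddings,
    scaling each feature vector by a gated relevance weight before
    summing everything into a single input vector for the encoder.

    word_emb:          list[float], the base word embedding
    feature_embs:      list of list[float], one embedding per feature
                       (e.g. POS tag, lemma, morphological tag)
    relevance_scores:  list[float], one raw score per feature
    """
    combined = list(word_emb)
    for emb, score in zip(feature_embs, relevance_scores):
        w = sigmoid(score)  # relevance weight for this feature
        for i, v in enumerate(emb):
            combined[i] += w * v
    return combined

# A feature with score 0.0 gets weight 0.5, so half of its
# embedding is added to the word embedding:
vec = relevance_weighted_embedding([1.0, 0.0], [[2.0, 2.0]], [0.0])
# vec == [2.0, 1.0]
```

In a trained system the relevance scores would themselves be learned (per feature type for self-relevance, or conditioned on the current word for word-based relevance), letting the model downweight noisy features on its own.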
