
Sparsely Factored Neural Machine Translation

2021-02-17 · Code Available

Noe Casas, Jose A. R. Fonollosa, Marta R. Costa-jussà


Abstract

The standard approach to incorporating linguistic information into neural machine translation systems consists of maintaining separate vocabularies for each annotated feature (e.g. POS tags, dependency relation labels), embedding them, and then aggregating each feature embedding with every subword of the word it belongs to. This approach, however, cannot easily accommodate annotation schemes that are not dense, i.e. that do not provide a value for every word. We propose a method suited to such sparse annotations, showing large improvements on out-of-domain data and comparable quality on in-domain data. Experiments are performed in low-resource scenarios on morphologically rich languages, namely Basque and German.
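The standard (dense) factored-embedding setup described above can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: the toy vocabularies, the additive aggregation, and all names are assumptions made for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabularies (illustrative only, not from the paper).
subword_vocab = {"un": 0, "break": 1, "able": 2}
pos_vocab = {"ADJ": 0, "NOUN": 1}

dim = 4
# Separate embedding tables: one for subwords, one per linguistic feature.
subword_emb = rng.normal(size=(len(subword_vocab), dim))
pos_emb = rng.normal(size=(len(pos_vocab), dim))

def embed_factored(subwords, pos_tag):
    """Embed each subword of a word and aggregate (here: by addition) the
    embedding of the word-level POS tag, broadcast to every subword."""
    sw_ids = [subword_vocab[s] for s in subwords]
    tag_vec = pos_emb[pos_vocab[pos_tag]]
    return subword_emb[sw_ids] + tag_vec  # shape: (num_subwords, dim)

# The word "unbreakable" (ADJ) split into three subwords: each subword
# vector carries the same word-level POS information.
vecs = embed_factored(["un", "break", "able"], "ADJ")
print(vecs.shape)  # (3, 4)
```

Note that this scheme assumes every word has a value for every feature; an annotation scheme that only tags some words (the sparse case the paper addresses) has no natural entry to look up for the untagged words.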
