Can We Use Word Embeddings for Enhancing Guarani-Spanish Machine Translation?

2022-05-01 · ComputEL (ACL) 2022 · Code Available

Santiago Góngora, Nicolás Giossa, Luis Chiruzzo

Abstract

Machine translation for low-resource languages, such as Guarani, is a challenging task due to the lack of data. One way of tackling it is to use pretrained word embeddings for model initialization. In this work we examine whether currently available data is enough to train rich embeddings for enhancing MT between Guarani and Spanish, by building a set of word embedding collections and training MT systems with them. We found that the trained vectors are strong enough to slightly improve the performance of some of the translation models and also to speed up training convergence.
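
The initialization idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `build_embedding_matrix`, the toy vectors, and the random range for missing words are all assumptions made for illustration. The sketch copies a pretrained vector for each vocabulary word that has one and falls back to small random values otherwise, which is one common way to seed a translation model's embedding layer.

```python
import random


def build_embedding_matrix(vocab, pretrained, dim, seed=0):
    """Return (matrix, coverage) for a vocabulary, in vocab order.

    Words found in `pretrained` (with the expected dimension) copy their
    vector; all other words get small random values, as a freshly
    initialized embedding layer would. `coverage` is the fraction of
    vocabulary words that received a pretrained vector.
    """
    rng = random.Random(seed)
    matrix = []
    hits = 0
    for word in vocab:
        vector = pretrained.get(word)
        if vector is not None and len(vector) == dim:
            matrix.append(list(vector))
            hits += 1
        else:
            # Assumed fallback: uniform noise in [-0.1, 0.1].
            matrix.append([rng.uniform(-0.1, 0.1) for _ in range(dim)])
    coverage = hits / len(vocab) if vocab else 0.0
    return matrix, coverage


# Toy, made-up vectors purely for illustration (not real trained embeddings).
toy_pretrained = {"jagua": [0.1, 0.2, 0.3], "mitã": [0.4, 0.5, 0.6]}
toy_vocab = ["<unk>", "jagua", "mitã", "óga"]
matrix, coverage = build_embedding_matrix(toy_vocab, toy_pretrained, dim=3)
```

The resulting list of rows could then be loaded into whatever embedding layer the translation toolkit exposes. The coverage value gives a quick sense of how much of the vocabulary actually benefits from the pretrained vectors, which matters in a low-resource setting.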
