Effective Architectures for Low Resource Multilingual Named Entity Transliteration

2020-12-01 · LoResMT (AACL) 2020

Molly Moran, Constantine Lignos

Abstract

In this paper, we evaluate LSTM, biLSTM, GRU, and Transformer architectures for the task of name transliteration in a many-to-one multilingual paradigm, transliterating from 590 languages to English. We experiment with different encoder-decoder combinations and evaluate them using accuracy, character error rate, and an F-measure based on longest continuous subsequences. We find that using a Transformer for the encoder and decoder performs best, improving accuracy by over 4 points compared to previous work. We explore whether manipulating the source text by adding macrolanguage flag tokens or pre-romanizing source strings can improve performance and find that neither manipulation has a positive effect. Finally, we analyze performance differences between the LSTM and Transformer encoders when using a Transformer decoder and find that the Transformer encoder is better able to handle insertions and substitutions when transliterating.
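The abstract mentions an F-measure based on longest continuous subsequences. The paper itself does not include code here, but one plausible reading of the metric, sketched below under the assumption that "continuous subsequence" means the longest common contiguous substring between reference and hypothesis, computes precision and recall from that match length and combines them harmonically:

```python
def lcs_len(ref: str, hyp: str) -> int:
    """Length of the longest common *contiguous* substring (dynamic programming)."""
    best = 0
    dp = [0] * (len(hyp) + 1)
    for r in ref:
        new = [0] * (len(hyp) + 1)
        for j, h in enumerate(hyp, start=1):
            if r == h:
                new[j] = dp[j - 1] + 1
                best = max(best, new[j])
        dp = new
    return best


def lcs_f_measure(ref: str, hyp: str) -> float:
    """F-measure from the longest contiguous match.

    Note: this is an illustrative reconstruction, not the authors'
    released implementation; their exact definition may differ.
    """
    if not ref or not hyp:
        return 0.0
    match = lcs_len(ref, hyp)
    precision = match / len(hyp)
    recall = match / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, `lcs_f_measure("smith", "smith")` is 1.0, while `lcs_f_measure("smith", "smyth")` is 0.4, since the longest contiguous match ("sm" or "th") has length 2 against strings of length 5.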
