Non-Autoregressive Models are Better Multilingual Translators

2021-09-29 · ICLR 2022

Zhenqiao Song, Hao Zhou, Lihua Qian, Jingjing Xu, Shanbo Cheng, Mingxuan Wang, Lei Li


Abstract

Multilingual machine translation aims to develop a single model for multiple language directions. However, existing multilingual models based on the Transformer are limited in both translation performance and inference speed. In this paper, we propose switch-GLAT, a non-autoregressive multilingual machine translation model with a code-switch decoder. It can generate contextual code-switched translations for a given source sentence and perform code-switch back-translation, greatly boosting multilingual translation performance. In addition, its inference is highly efficient thanks to its parallel decoder. Experiments show that switch-GLAT outperforms the multilingual Transformer by as much as 1.16 BLEU, with 6.6x faster decoding at inference.
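
To make the efficiency claim concrete, the following is a minimal toy sketch of the general difference between autoregressive and parallel (non-autoregressive) decoding. It is not the switch-GLAT implementation: the functions `toy_next_token`, `toy_token_at`, `autoregressive_decode`, and `parallel_decode` are hypothetical names introduced here, and the "model" is a stand-in that uppercases source tokens. The sketch only counts sequential decoder passes; the 6.6x figure in the abstract is a measured result and cannot be derived from this count.

```python
from typing import List, Tuple


def toy_next_token(source: List[str], prefix: List[str]) -> str:
    """Hypothetical autoregressive step: the next token depends on the prefix."""
    return source[len(prefix)].upper()


def toy_token_at(source: List[str], position: int) -> str:
    """Hypothetical parallel step: each position is predicted independently."""
    return source[position].upper()


def autoregressive_decode(source: List[str]) -> Tuple[List[str], int]:
    """Generate left to right; each token needs its own sequential pass."""
    output: List[str] = []
    passes = 0
    while len(output) < len(source):
        output.append(toy_next_token(source, output))
        passes += 1
    return output, passes


def parallel_decode(source: List[str]) -> Tuple[List[str], int]:
    """Predict all positions at once; the comprehension stands in for one batched pass."""
    output = [toy_token_at(source, i) for i in range(len(source))]
    return output, 1


src = ["wir", "lieben", "musik"]
ar_out, ar_passes = autoregressive_decode(src)
par_out, par_passes = parallel_decode(src)
print(ar_out, ar_passes)
print(par_out, par_passes)
```

In this toy setting both decoders return the same tokens, but the autoregressive loop needs one sequential pass per output token while the parallel decoder needs a single pass regardless of length.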

Tasks

Decoder, Machine Translation, Sentence, Translation
