Improving Non-Autoregressive Neural Machine Translation via Modeling Localness

2022-10-01 · COLING 2022

Yong Wang, Xinwei Geng

Abstract

Non-autoregressive translation (NAT) models, which eliminate the sequential dependencies within the target sentence, achieve remarkable inference speed but suffer from inferior translation quality. To explore the underlying causes, we carry out a thorough preliminary study of the attention mechanism, which reveals a serious weakness in capturing localness compared with conventional autoregressive translation (AT). In response, we propose to improve the localness of NAT models by explicitly introducing information about surrounding words. Specifically, temporal convolutions are incorporated into both the encoder and the decoder to obtain localness-aware representations. Extensive experiments on several typical translation datasets show that the proposed method achieves consistent and significant improvements over strong NAT baselines. Further analyses on the WMT14 En-De translation task reveal that, compared with baselines, our approach accelerates convergence in training and achieves equivalent performance with 70% fewer training steps.
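
As a rough illustration of what a temporal convolution over token representations might look like, the sketch below mixes each position with its immediate neighbours and adds the result back to the input. It is not the authors' implementation: the function names `local_conv` and `localness_aware`, the symmetric window, the kernel shared across feature dimensions, the zero padding, and the residual combination are all assumptions made for this example.

```python
def local_conv(seq, kernel):
    """Temporal (1D) convolution over positions with zero 'same' padding.

    seq:    list of T token vectors, each a list of D floats.
    kernel: odd-length list of weights, shared across all D dimensions
            (an assumption of this sketch, not taken from the paper).
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    half = len(kernel) // 2
    length = len(seq)
    dim = len(seq[0]) if seq else 0
    out = []
    for t in range(length):
        vec = [0.0] * dim
        for k, weight in enumerate(kernel):
            s = t + k - half  # neighbouring position covered by this tap
            if 0 <= s < length:  # positions outside the sentence act as zeros
                for d in range(dim):
                    vec[d] += weight * seq[s][d]
        out.append(vec)
    return out


def localness_aware(seq, kernel):
    """Add the convolved neighbourhood back onto each token vector.

    The residual sum is a hypothetical choice for combining local context
    with the original representation.
    """
    conv = local_conv(seq, kernel)
    return [[x + c for x, c in zip(xs, cs)] for xs, cs in zip(seq, conv)]


if __name__ == "__main__":
    tokens = [[1.0], [2.0], [3.0]]
    print(localness_aware(tokens, [0.5, 0.0, 0.5]))
```

A symmetric window is used here because a NAT decoder does not generate left to right, so each position can look at both sides; an autoregressive decoder would need a causal (left-only) window instead.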

Tasks

Decoder, Machine Translation, Sentence, Translation
