Maximum Proxy-Likelihood Estimation for Non-autoregressive Machine Translation

2021-11-16, ACL ARR November 2021

Anonymous

Abstract

Maximum Likelihood Estimation (MLE) is commonly used in machine translation, where models with higher likelihood are assumed to perform better in translation. However, this assumption does not hold for non-autoregressive Transformers (NATs), a new family of translation models. In this paper, we present both theoretical and empirical analyses of why simply maximizing the likelihood does not produce a good NAT model. Based on the theoretical analysis, we propose Maximum Proxy-Likelihood Estimation (MPLE), a novel method that addresses this training issue with MLE. Additionally, MPLE offers a new perspective on the existing successes in training NATs: much previous work can be regarded as implicitly optimizing our objective.

Tasks

Machine Translation, Translation
