Finding the Optimal Vocabulary Size for Neural Machine Translation

2020-04-05 · Findings of the Association for Computational Linguistics

Thamme Gowda, Jonathan May

Code

github.com/thammegowda/005-nmt-imbalance
Official implementation, linked in the paper

Abstract

We cast neural machine translation (NMT) as a classification task in an autoregressive setting and analyze the limitations of both classification and autoregression components. Classifiers are known to perform better with balanced class distributions during training. Since the Zipfian nature of languages causes imbalanced classes, we explore its effect on NMT. We analyze the effect of various vocabulary sizes on NMT performance on multiple languages with many data sizes, and reveal an explanation for why certain vocabulary sizes are better than others.
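The link between Zipfian token frequencies and class imbalance can be illustrated with a small sketch. The imbalance measure used here (total variation distance between the observed token distribution and a uniform one) is an assumption chosen for illustration and is not taken from the abstract; the helper names `class_imbalance` and `zipf_corpus` and the toy corpora are likewise hypothetical.

```python
from collections import Counter


def class_imbalance(tokens):
    """Total variation distance between the observed token distribution
    and a uniform distribution over the observed types.

    Assumed measure for illustration: 0.0 means perfectly balanced classes;
    values closer to 1.0 mean mass is concentrated on a few types.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    num_types = len(counts)
    uniform = 1.0 / num_types
    return 0.5 * sum(abs(c / total - uniform) for c in counts.values())


def zipf_corpus(num_types, scale=1000):
    """Toy corpus in which the type at rank r occurs about scale / r times."""
    tokens = []
    for rank in range(1, num_types + 1):
        tokens.extend([f"w{rank}"] * max(1, scale // rank))
    return tokens


# Every type equally frequent: no imbalance.
balanced = ["a", "b", "c", "d"] * 25
# Zipf-like frequencies: a few types dominate.
skewed = zipf_corpus(num_types=50)

print(class_imbalance(balanced))  # 0.0
print(class_imbalance(skewed))
```

Applying the same function to a real corpus tokenized with different vocabulary sizes would give one rough way to compare how balanced the resulting target classes are.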

Tasks

Classification, General Classification, Machine Translation, NMT, Translation
