Nearest Neighbor Machine Translation is Meta-Optimizer on Output Projection Layer
Ruize Gao, Zhirui Zhang, Yichao Du, Lemao Liu, Rui Wang
- Code: https://github.com/ruizgao/knnmt-meta-optimizer (official PyTorch implementation)
Abstract
Nearest Neighbor Machine Translation (kNN-MT) has achieved great success in domain adaptation tasks by integrating pre-trained Neural Machine Translation (NMT) models with domain-specific token-level retrieval. However, the reasons underlying its success have not been thoroughly investigated. In this paper, we comprehensively analyze kNN-MT through theoretical and empirical studies. First, we provide new insight into the working mechanism of kNN-MT as an efficient technique for implicitly executing gradient descent on the output projection layer of NMT, indicating that it is a specific case of model fine-tuning. We then conduct multi-domain experiments and word-level analysis to examine the performance differences between kNN-MT and entire-model fine-tuning. Our findings suggest that: (1) combining kNN-MT with adapters yields translation performance comparable to fine-tuning on in-domain test sets, while achieving better performance on out-of-domain test sets; (2) fine-tuning significantly outperforms kNN-MT on the recall of in-domain low-frequency words, but this gap can be bridged by optimizing the context representations with additional adapter layers.
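For readers unfamiliar with the mechanism the paper analyzes, below is a minimal PyTorch sketch of a standard kNN-MT decoding step in the style of Khandelwal et al. (2021): retrieve the k nearest cached context representations, turn their paired target tokens into a retrieval distribution, and interpolate it with the NMT distribution. The function name, exact-search retrieval, and default hyperparameters are our illustrative choices, not code from the paper's repository.

```python
import torch
import torch.nn.functional as F

def knn_mt_probs(query, datastore_keys, datastore_values, nmt_log_probs,
                 k=8, temperature=10.0, lam=0.5):
    """One kNN-MT decoding step (illustrative sketch).

    query:            (d,)   decoder hidden state at the current step
    datastore_keys:   (N, d) cached context representations
    datastore_values: (N,)   LongTensor of target-token ids paired with each key
    nmt_log_probs:    (V,)   log-probabilities from the NMT output projection layer
    """
    # L2 distances from the query to every cached key. Exact search is used
    # here for clarity; a real system would query an approximate index
    # (e.g., FAISS) over millions of entries.
    dists = torch.cdist(query.unsqueeze(0), datastore_keys).squeeze(0)  # (N,)
    neg_top_d, top_i = torch.topk(-dists, k)  # k smallest distances

    # kNN distribution: softmax over negative distances, scattered onto the
    # vocabulary positions of the retrieved target tokens.
    vocab_size = nmt_log_probs.size(0)
    knn_probs = torch.zeros(vocab_size)
    weights = F.softmax(neg_top_d / temperature, dim=0)
    knn_probs.scatter_add_(0, datastore_values[top_i], weights)

    # Fixed-coefficient interpolation of the retrieval and NMT distributions.
    return lam * knn_probs + (1.0 - lam) * nmt_log_probs.exp()
```

Here λ, k, and the softmax temperature are the usual kNN-MT hyperparameters; the datastore is built offline by running the pre-trained NMT model over in-domain parallel data and caching (hidden state, next target token) pairs.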
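The "implicit gradient descent" reading can be sketched with the linear-attention duality (Irie et al., 2022; Dai et al., 2023) that analyses of this kind build on; the following is our schematic paraphrase, not the paper's exact derivation:

```latex
% Retrieval over the datastore resembles attention with the decoder state q
% as query and cached pairs (k_i, v_i) as keys/values. In linear-attention
% form, its contribution to the logits can be folded into the output
% projection W as an implicit, data-dependent update:
\[
  (W + \Delta W)\, q ,
  \qquad
  \Delta W \;=\; \sum_{i=1}^{k} \eta_i \, v_i k_i^{\top} ,
\]
% which has the shape of one gradient-descent step on W with per-example
% updates built from the retrieved key--value pairs. In this sense kNN-MT
% acts as a meta-optimizer on the output projection layer, i.e., a specific
% case of fine-tuning that touches only that layer.
```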