BERTology for Machine Translation: What BERT Knows about Linguistic Difficulties for Translation

2022-06-01 · LREC 2022

Yuqian Dai, Marc de Kamps, Serge Sharoff


Abstract

Pre-trained transformer-based models such as BERT show excellent performance on most natural language processing benchmarks, but we still lack a good understanding of the linguistic knowledge BERT brings to Neural Machine Translation (NMT). We use syntactic probes and Quality Estimation (QE) models to analyze how well BERT represents syntactic dependencies and how these dependencies affect machine translation quality, exploring which syntactic dependencies are difficult for BERT-based NMT engines. Our probing experiments confirm that pre-trained BERT “knows” about syntactic dependencies, but its ability to recognize them often decreases after fine-tuning for NMT. We also detect a relationship between syntactic dependencies in three languages and the quality of their translations, which shows which specific syntactic dependencies are likely to be a significant cause of low-quality translations.
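To illustrate the probing technique the abstract refers to, the sketch below trains a diagnostic linear (softmax) probe to predict a token's dependency label from its vector representation. This is a minimal, self-contained illustration, not the paper's setup: in the paper the probe would read BERT hidden states, whereas here synthetic class-separable vectors stand in for them, and the three label names are hypothetical examples.

```python
# Minimal sketch of a diagnostic linear probe. Synthetic vectors stand in
# for BERT hidden states; dimensions, labels, and data are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Pretend each token has a 32-dim "hidden state" and one of 3 dependency
# labels (e.g. nsubj, obj, amod). Distinct class means make the labels
# linearly recoverable, mimicking a representation that encodes syntax.
n_per_class, dim, n_classes = 100, 32, 3
means = rng.normal(0.0, 2.0, size=(n_classes, dim))
X = np.vstack([means[c] + rng.normal(size=(n_per_class, dim))
               for c in range(n_classes)])
y = np.repeat(np.arange(n_classes), n_per_class)

# Train a softmax (multinomial logistic regression) probe by gradient descent.
W = np.zeros((dim, n_classes))
b = np.zeros(n_classes)
onehot = np.eye(n_classes)[y]
for _ in range(300):
    logits = X @ W + b
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    grad = (p - onehot) / len(X)                  # gradient of mean cross-entropy
    W -= 1.0 * (X.T @ grad)
    b -= 1.0 * grad.sum(axis=0)

# Probe accuracy well above the 1/3 chance level indicates the
# representations encode the dependency labels.
acc = float((np.argmax(X @ W + b, axis=1) == y).mean())
print(f"probe accuracy: {acc:.2f}")
```

In the paper's setting, the comparison of interest is this accuracy before and after fine-tuning BERT for NMT: a drop after fine-tuning suggests the model partially loses its ability to recognize those dependencies.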
