All You Need is Source! A Study on Source-based Quality Estimation for Neural Machine Translation

2022-09-01 · AMTA 2022

Jon Cambra, Mara Nunziatini

Abstract

Segment-level Quality Estimation (QE) is an increasingly sought-after task in the Machine Translation (MT) industry. In recent years, it has evolved impressively, thanks not only to supervised models that use source and hypothesis information but also to the use of MT probabilities. This work presents a different approach to QE in which only the source segment and the Neural MT (NMT) training data are needed, making it possible to approximate translation quality before inference. Our work is based on the idea that NMT quality at the segment level depends on the degree of similarity between the source segment to be translated and the engine's training data. The proposed features, which measure this aspect of the data, achieve competitive correlations with MT metrics and human judgment and prove advantageous for the post-editing (PE) prioritization task with domain-adapted engines.
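
To make the core idea concrete, here is a minimal sketch of one possible source-to-training-data similarity feature. The abstract does not specify which features the authors use, so everything below is an assumption for illustration: the function names (`char_ngrams`, `build_training_ngrams`, `source_coverage`, `prioritize_for_post_editing`) are hypothetical, and character trigram coverage is a stand-in, not the paper's method.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of a lowercased, whitespace-normalised string."""
    norm = " ".join(text.lower().split())
    return {norm[i:i + n] for i in range(len(norm) - n + 1)}


def build_training_ngrams(training_sources, n=3):
    """Collect every character n-gram seen on the source side of the training data."""
    seen = set()
    for sentence in training_sources:
        seen |= char_ngrams(sentence, n)
    return seen


def source_coverage(segment, training_ngrams, n=3):
    """Fraction of the segment's n-grams that also occur in the training data.

    Assumed proxy: higher coverage means the segment looks more like what the
    engine was trained on. Returns 0.0 for segments shorter than n characters.
    """
    grams = char_ngrams(segment, n)
    if not grams:
        return 0.0
    return len(grams & training_ngrams) / len(grams)


def prioritize_for_post_editing(segments, training_ngrams, n=3):
    """Sort segments by ascending coverage, so the least familiar ones come first."""
    scored = [(source_coverage(s, training_ngrams, n), s) for s in segments]
    return sorted(scored, key=lambda pair: pair[0])
```

Because the score uses only the source segment and the training data, it can be computed before any translation is produced. Whether such a score correlates with MT metrics or human judgment would have to be checked empirically for a given engine.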