Can Synthetic Translations Improve Bitext Quality?

2021-10-16 · ACL ARR October 2021

Anonymous


Abstract

Synthetic translations have been used for a wide range of NLP tasks, primarily as a means of data augmentation. This work instead explores how synthetic translations can selectively replace potentially imperfect reference translations in mined bitext. We find that synthetic samples can improve bitext quality without any additional bilingual supervision when they replace the originals based on a semantic equivalence classifier that helps mitigate NMT noise. The improved quality of the revised bitext is confirmed intrinsically via human evaluation and extrinsically through bilingual induction and MT tasks.
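The selective replacement idea in the abstract could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function `revise_bitext`, the `margin` parameter, the rule "replace only when the synthetic translation scores strictly higher than the original", and the toy lexicon-based translator and scorer are all assumptions introduced here. In practice the translator would be an NMT model and the scorer a trained semantic equivalence classifier.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def revise_bitext(
    pairs: List[Pair],
    translate: Callable[[str], str],
    equivalence: Callable[[str, str], float],
    margin: float = 0.0,
) -> List[Pair]:
    """Replace a mined target with a synthetic one only when the
    (hypothetical) equivalence scorer prefers the synthetic translation."""
    revised = []
    for src, ref in pairs:
        syn = translate(src)  # synthetic translation of the source side
        if equivalence(src, syn) > equivalence(src, ref) + margin:
            revised.append((src, syn))  # synthetic judged more equivalent
        else:
            revised.append((src, ref))  # keep the original mined target
    return revised


# Toy stand-ins (assumptions): a word-for-word lexicon "translator" and a
# Jaccard-overlap "classifier" between expected and observed target tokens.
TOY_LEXICON = {"chat": "cat", "noir": "black", "chien": "dog"}


def toy_translate(src: str) -> str:
    return " ".join(TOY_LEXICON.get(tok, tok) for tok in src.split())


def toy_equivalence(src: str, tgt: str) -> float:
    expected = {TOY_LEXICON.get(tok, tok) for tok in src.split()}
    observed = set(tgt.split())
    union = expected | observed
    return len(expected & observed) / len(union) if union else 1.0


mined = [
    ("chat noir", "black cat"),              # already equivalent
    ("chien noir", "black cat sale today"),  # noisy mined target
]
revised = revise_bitext(mined, toy_translate, toy_equivalence)
```

With these toy components, the first pair keeps its original target because the synthetic translation does not score strictly higher, while the noisy second pair is replaced. Raising `margin` would make replacement more conservative, which is one plausible way to guard against noise in the synthetic translations.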
