Machine Translated Text Detection Through Text Similarity with Round-Trip Translation
Hoang-Quoc Nguyen-Son, Tran Thao, Seira Hidano, Ishita Gupta, Shinsaku Kiyomoto
Code: github.com/quocnsh/machine_translation_detection (official)
Abstract
Translated texts have been used for malicious purposes, such as plagiarism or fake reviews. Existing detectors are built around a specific translator (e.g., Google) and fail to detect a text translated by a "strange" (unfamiliar) translator. When the same translator is used, a translated text is similar to its round-trip translation, that is, the result of translating the text into another language and then back into the original language. In contrast, the round-trip translation differs significantly from its input when that input is an original text or a text translated by a strange translator. Hence, we propose a detector using text similarity with round-trip translation (TSRT). TSRT achieves 86.9% accuracy in detecting a translated text from a strange translator. It outperforms existing detectors (77.9%) and human recognition (53.3%).
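The round-trip idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `translate` callable, the language codes, the word-level `difflib` similarity, and the fixed `threshold` are all assumptions chosen for illustration, and the official repository may use different similarity features and a trained classifier.

```python
from difflib import SequenceMatcher
from typing import Callable

# Hypothetical translator interface: (text, source_lang, target_lang) -> text.
# Any real machine translation client would need to be wrapped to fit it.
Translator = Callable[[str, str, str], str]


def round_trip(text: str, translate: Translator,
               src: str = "en", pivot: str = "de") -> str:
    """Translate text into a pivot language and back into the source language."""
    return translate(translate(text, src, pivot), pivot, src)


def similarity(a: str, b: str) -> float:
    """Word-level similarity in [0, 1]; a stand-in for the paper's measure."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def looks_translated(text: str, translate: Translator,
                     threshold: float = 0.9,
                     src: str = "en", pivot: str = "de") -> bool:
    """Flag text whose round-trip translation stays close to the input.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return similarity(text, round_trip(text, translate, src, pivot)) >= threshold
```

In this sketch, a text that survives the round trip almost unchanged is flagged as likely machine translated, while a text that changes substantially is treated as original. A practical detector would tune the decision rule on labeled data rather than rely on a hand-picked threshold.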