Context-Interactive Pre-Training for Document Machine Translation
Pengcheng Yang, Pei Zhang, Boxing Chen, Jun Xie, Weihua Luo
Abstract
Document machine translation aims to translate a source sentence into the target language in the presence of additional contextual information. However, it typically suffers from a lack of document-level bilingual data. To remedy this, we propose a simple yet effective context-interactive pre-training approach that aims to benefit from external large-scale corpora. The proposed model performs inter-sentence generation to capture cross-sentence dependencies within the target document, and cross-sentence translation to make better use of valuable contextual information. Comprehensive experiments show that our approach achieves state-of-the-art performance on three benchmark datasets, significantly outperforming a variety of baselines.