Controlling Text Complexity in Neural Machine Translation
2019-11-03 · IJCNLP 2019 · Code Available
Sweta Agrawal, Marine Carpuat
Code
- github.com/sweta20/ComplexityControlledMT (official, linked in paper)
Abstract
This work introduces a machine translation task in which the output is aimed at audiences with different levels of target language proficiency. We collect a high-quality dataset of news articles available in English and Spanish, written for diverse grade levels, and propose a method to align segments across comparable bilingual articles. The resulting dataset makes it possible to train multi-task sequence-to-sequence models that translate Spanish into English targeted at an easier reading grade level than the original Spanish. We show that these multi-task models outperform pipeline approaches that translate and simplify text independently.