On the interaction of automatic evaluation and task framing in headline style transfer
2021-01-05 · EvalNLGEval workshop, INLG 2020 (ACL Anthology) · Code Available
Lorenzo De Mattei, Michele Cafagna, Huiyuan Lai, Felice Dell'Orletta, Malvina Nissim, Albert Gatt
- Code: github.com/michelecafagna26/CHANGE-IT (official) ★ 4
Abstract
An ongoing debate in the NLG community concerns the best way to evaluate systems, with human evaluation often considered more reliable than corpus-based metrics. However, tasks involving subtle textual differences, such as style transfer, tend to be hard for humans to perform. In this paper, we propose an evaluation method for this task based on purposely-trained classifiers, showing that it better reflects system differences than traditional metrics such as BLEU and ROUGE.
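To make the idea concrete, here is a minimal, self-contained sketch of classifier-based evaluation for style transfer. This is an illustrative toy, not the paper's actual setup: it trains a tiny Naive Bayes word model on two hypothetical style corpora (`style_a_texts`, `style_b_texts` are made-up examples) and scores a system by the fraction of its outputs the classifier assigns to the target style.

```python
from collections import Counter
import math


def train_nb(style_a_texts, style_b_texts):
    """Train a tiny Naive Bayes word model over two style corpora.

    Returns per-word log-probability ratios (style A over style B),
    with add-one smoothing over the shared vocabulary.
    """
    counts_a = Counter(w for t in style_a_texts for w in t.lower().split())
    counts_b = Counter(w for t in style_b_texts for w in t.lower().split())
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + len(vocab)
    total_b = sum(counts_b.values()) + len(vocab)
    return {
        w: math.log((counts_a[w] + 1) / total_a)
           - math.log((counts_b[w] + 1) / total_b)
        for w in vocab
    }


def classify_as_a(text, log_ratio):
    """Return True if the text looks more like style A than style B."""
    score = sum(log_ratio.get(w, 0.0) for w in text.lower().split())
    return score > 0


def style_transfer_accuracy(outputs, log_ratio, target_is_a):
    """Fraction of system outputs the classifier assigns to the target style."""
    hits = sum(classify_as_a(t, log_ratio) == target_is_a for t in outputs)
    return hits / len(outputs)


# Toy corpora standing in for two headline styles (invented examples).
tabloid = ["shock horror celebrity scandal", "amazing miracle cure shock"]
broadsheet = ["government announces fiscal policy",
              "parliament debates fiscal reform"]

model = train_nb(tabloid, broadsheet)
system_outputs = ["shock scandal rocks parliament", "amazing miracle headline"]
acc = style_transfer_accuracy(system_outputs, model, target_is_a=True)
```

Unlike BLEU or ROUGE, which reward n-gram overlap with references, this score directly asks whether each output lands in the target style, which is the property style transfer is supposed to change.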