Are We Evaluating Paraphrase Generation Accurately?

2021-11-16 · ACL ARR November 2021

Anonymous

Abstract

A paraphrase is a restatement of a text that conveys the same meaning using different expressions. Evaluating paraphrase generation (PG) is a complex task, and the field currently lacks a complete picture of the relevant criteria and metrics. In this paper, we survey the automatic metrics and human evaluation criteria used in PG evaluation. Based on the survey results, we propose a reference-free automatic evaluation toolkit and list clear human evaluation criteria. Moreover, we consider paraphrase selection in downstream tasks and propose a simple but effective evaluation Filter model, which fuses multiple automatic metrics to fit human evaluation without requiring any references.

Tasks

Paraphrase Generation, Survey
