Mitigating the Impact of Reference Quality on Evaluation of Summarization Systems with Reference-Free Metrics

2024-10-08

Théo Gigant, Camille Guinaudeau, Marc Decombas, Frédéric Dufaux


Code

github.com/giganttheo/importance-based-relevance-score
Official implementation (linked in the paper)

Abstract

Automatic metrics are used as proxies to evaluate abstractive summarization systems when human annotations are too expensive. To be useful, these metrics should be fine-grained, show a high correlation with human annotations, and ideally be independent of reference quality; however, most standard evaluation metrics for summarization are reference-based, and existing reference-free metrics correlate poorly with relevance, especially on summaries of longer documents. In this paper, we introduce a reference-free metric that correlates well with human-evaluated relevance while being very cheap to compute. We show that this metric can also be used alongside reference-based metrics to improve their robustness in low-quality reference settings.

Tasks

Abstractive Text Summarization
