WikiSum: Coherent Summarization Dataset for Efficient Human-Evaluation

2021-08-01 · ACL 2021

Nachshon Cohen, Oren Kalinsky, Yftah Ziser, Alessandro Moschitti

Abstract

Recent work has made significant advances on summarization tasks, facilitated by summarization datasets. Several existing datasets take the form of coherent-paragraph summaries. However, these datasets were curated from academic documents written for experts, which makes the essential step of assessing summarization output through human evaluation very demanding. To overcome this limitation, we present a dataset based on article summaries appearing on the WikiHow website, composed of how-to articles and coherent-paragraph summaries written in plain language. We compare our dataset's attributes to those of existing datasets, including readability and world knowledge, showing that our dataset makes human evaluation significantly easier and thus more effective. A human evaluation conducted on PubMed and the proposed dataset reinforces our findings.

Tasks

Articles, World Knowledge
