PrefScore: Pairwise Preference Learning for Reference-free Single-document Summarization Quality Assessment
2022-01-16ACL ARR January 2022Unverified0· sign in to hype
Anonymous
Unverified — Be the first to reproduce this paper.
ReproduceAbstract
Evaluating machine-generated summaries without a human-written reference summary has been a need for a long time. Inspired by preference labeling in existing works of summarization evaluation, we propose to judge summary quality by learning the preference rank of summaries using the Bradley-Terry power ranking model from generated inferior summaries of a base summary. Despite the simplicity of our method, extensive experiments on several datasets show that our weakly supervised scheme can produce scores highly correlate with human ratings.