Unlocking Structure Measuring: Introducing PDD, an Automatic Metric for Positional Discourse Coherence

2024-02-15 · Code available

Yinhong Liu, Yixuan Su, Ehsan Shareghi, Nigel Collier

Abstract

Recent large language models (LLMs) have shown remarkable performance in aligning generated text with user intentions across various tasks. In long-form text generation, there has been growing interest in generation from a discourse coherence perspective. However, existing lexical and semantic metrics such as BLEU, ROUGE, and BERTScore cannot effectively capture discourse coherence. The development of discourse-specific automatic evaluation methods for assessing LLM output warrants greater focus and exploration. In this paper, we present a novel automatic metric designed to quantify the discourse divergence between two long-form articles. Extensive experiments on three datasets from representative domains demonstrate that our metric aligns more closely with human preferences and GPT-4 coherence evaluation than existing evaluation methods do.
