Discourse Coherence in the Wild: A Dataset, Evaluation and Methods

2018-05-14 · WS 2018

Alice Lai, Joel Tetreault


Code

github.com/aylai/GCDC-corpus
Official repository, linked in the paper

Abstract

To date there has been very little work on assessing discourse coherence methods on real-world data. To address this, we present a new corpus of real-world texts (GCDC) as well as the first large-scale evaluation of leading discourse coherence algorithms. We show that neural models, including two that we introduce here (SentAvg and ParSeq), tend to perform best. We analyze these performance differences and discuss patterns we observed in low coherence texts in four domains.

Tasks

Coherence Evaluation
