Information structure in the Potsdam Commentary Corpus: Topics
2016-05-01 · LREC 2016
Manfred Stede, Sara Mamprin
Abstract
The Potsdam Commentary Corpus is a collection of 175 German newspaper commentaries annotated on a variety of layers. This paper introduces a new layer that covers the linguistic notion of information-structural topic (not to be confused with 'topic' as applied to documents in information retrieval). To our knowledge, this is the first larger topic-annotated resource for German (and one of the first for any language). We describe the annotation guidelines, the annotation process, and the results of an inter-annotator agreement study, which compare favourably with related work. The annotated corpus is freely available for research.