ANCOR_Centre, a large free spoken French coreference corpus: description of the resource and reliability measures
Judith Muzerelle, Anaïs Lefeuvre, Emmanuel Schang, Jean-Yves Antoine, Aurore Pelletier, Denis Maurel, Iris Eshkol, Jeanne Villaneau
Abstract
This article presents ANCOR_Centre, a French coreference corpus available under a Creative Commons Licence. With a size of around 500,000 words, the corpus is large enough to serve the needs of data-driven approaches in NLP and represents one of the largest coreference resources currently available. The corpus focuses exclusively on spoken language and aims to represent a variety of spoken genres. ANCOR_Centre includes anaphora as well as coreference relations involving nominal and pronominal mentions. The paper describes in detail the annotation scheme and the reliability measures computed on the resource.