SOTAVerified

TimeBankPT: A TimeML Annotated Corpus of Portuguese

2012-05-01LREC 2012Unverified0· sign in to hype

Francisco Costa, Ant{\'o}nio Branco

Unverified — Be the first to reproduce this paper.

Reproduce

Abstract

In this paper, we introduce TimeBankPT, a TimeML annotated corpus of Portuguese. It has been produced by adapting an existing resource for English, namely the data used in the first TempEval challenge. TimeBankPT is the first corpus of Portuguese with rich temporal annotations (i.e. it includes annotations not only of temporal expressions but also about events and temporal relations). In addition, it was subjected to an automated error mining procedure that checks the consistency of the annotated temporal relations based on their logical properties. This procedure allowed for the detection of some errors in the annotations, that also affect the original English corpus. The Portuguese language is currently undergoing a spelling reform, and several countries where Portuguese is official are in a transitional period where old and new orthographies are valid. TimeBankPT adopts the recent spelling reform. This decision is to preserve its future usefulness. TimeBankPT is freely available for download.

Tasks

Reproductions