Enhancing The RATP-DECODA Corpus With Linguistic Annotations For Performing A Large Range Of NLP Tasks

2016-05-01 · LREC 2016

Carole Lailler, Anaïs Landeau, Frédéric Béchet, Yannick Estève, Paul Deléglise

Abstract

In this article, we present the RATP-DECODA Corpus, which is composed of 67 hours of speech from telephone conversations of a Customer Care Service (CCS). The first version of this corpus is already available online at http://sldr.org/sldr000847/fr. However, many enhancements have been made to support the development of automatic techniques for transcribing conversations and capturing their meaning. These enhancements fall into two categories: first, we have increased the size of the corpus with manual transcriptions from a new operational day; second, we have added new linguistic annotations to the whole corpus (either manually or through automatic processing) in order to perform various linguistic tasks, from syntactic and semantic parsing to dialog act tagging and dialog summarization.
