Clause-based Discourse Segmentation of Arabic Texts

2012-05-01 · LREC 2012

Iskandar Keskes, Farah Benamara, Lamia Hadrich Belguith

Abstract

This paper describes a rule-based approach to segmenting Arabic texts into clauses. Our method relies on an extensive analysis of a large set of lexical cues as well as punctuation marks. The analysis was carried out on two corpus genres: news articles and elementary school textbooks. We propose a three-step segmentation algorithm: the first step uses only punctuation marks, the second relies only on lexical cues, and the third combines punctuation marks and lexical cues. The results were compared with manual segmentations produced by experts.
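
The three segmentation strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not list the actual cues or rules, so the punctuation set, the three lexical cues (لأن "because", لكن "but", حيث "where"), and all function names below are assumptions chosen only for illustration.

```python
import re

# Hypothetical inventories: the paper's real cue lists are not given in the abstract.
PUNCTUATION = ".!?؟؛،:"                  # Latin and Arabic punctuation marks
LEXICAL_CUES = ("لأن", "لكن", "حيث")     # "because", "but", "where"


def split_on_punctuation(text):
    """Strategy 1: cut after a punctuation mark that is followed by whitespace."""
    pattern = r"(?<=[" + re.escape(PUNCTUATION) + r"])\s+"
    return [part.strip() for part in re.split(pattern, text) if part.strip()]


def split_on_cues(text, cues=LEXICAL_CUES):
    """Strategy 2: cut before a lexical cue that stands as a separate word."""
    alternatives = "|".join(re.escape(cue) for cue in cues)
    pattern = r"\s+(?=(?:" + alternatives + r")\s)"
    return [part.strip() for part in re.split(pattern, text) if part.strip()]


def split_on_both(text, cues=LEXICAL_CUES):
    """Strategy 3: apply the punctuation split, then the cue split inside each piece."""
    return [
        clause
        for segment in split_on_punctuation(text)
        for clause in split_on_cues(segment, cues)
    ]


# "The boy went to school because the lesson is important. He returned home."
sample = "ذهب الولد إلى المدرسة لأن الدرس مهم. عاد إلى البيت."
for clause in split_on_both(sample):
    print(clause)
```

On the sample sentence, the punctuation-only and cue-only strategies each yield two segments, while the combined strategy yields three. A real system would need a much larger cue inventory and rules for cues that are ambiguous or attached to other words as clitics, which this sketch does not handle.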
