Named Entity Recognition for Linguistic Rapid Response in Low-Resource Languages: Sorani Kurdish and Tajik

2016-12-01 · COLING 2016

Patrick Littell, Kartik Goyal, David R. Mortensen, Alexa Little, Chris Dyer, Lori Levin

Abstract

This paper describes our construction of named-entity recognition (NER) systems in two Western Iranian languages, Sorani Kurdish and Tajik, as part of a pilot study of "Linguistic Rapid Response" to potential emergency humanitarian relief situations. In the absence of large annotated corpora, parallel corpora, treebanks, bilingual lexica, etc., we found the following to be effective: exploiting distributional regularities in monolingual data, projecting information across closely related languages, and utilizing human linguist judgments. We show promising results on both a four-month exercise in Sorani and a two-day exercise in Tajik, achieved with minimal annotation costs.
