Finite-state morphological transducers for three Kypchak languages

2014-05-01 · LREC 2014

Jonathan Washington, Ilnar Salimzyanov, Francis Tyers

Abstract

This paper describes the development of free/open-source finite-state morphological transducers for three Turkic languages (Kazakh, Tatar, and Kumyk), representing one language from each of the three sub-branches of the Kypchak branch of Turkic. The work uses the Helsinki Finite-State Toolkit (HFST). The paper also describes how the transducer for each subsequent closely related language took less time to develop. An evaluation shows that all of the transducers have reasonable coverage (around 90%) on freely available corpora of the languages, and high precision on a manually verified test set.
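The coverage figure in the abstract can be illustrated with a minimal sketch. It assumes that coverage means the share of corpus tokens that receive at least one analysis, which is a common definition but is not spelled out in the abstract. The `analyse` callable, the `TOY_LEXICON` table, and the tag strings are hypothetical stand-ins for a real transducer lookup and are not part of HFST or of the paper.

```python
from typing import Callable, Iterable, List


def naive_coverage(tokens: Iterable[str],
                   analyse: Callable[[str], List[str]]) -> float:
    """Return the fraction of tokens for which analyse() yields an analysis.

    Assumption: a token counts as covered if it gets at least one analysis,
    whether or not that analysis is correct.
    """
    token_list = list(tokens)
    if not token_list:
        return 0.0
    covered = sum(1 for token in token_list if analyse(token))
    return covered / len(token_list)


# Hypothetical toy lexicon standing in for a compiled transducer.
TOY_LEXICON = {
    "alpha": ["alpha<n><nom>"],
    "beta": ["beta<v><past>", "beta<n><nom>"],
}


def toy_analyse(token: str) -> List[str]:
    """Look the token up in the toy lexicon; unknown tokens get no analyses."""
    return TOY_LEXICON.get(token, [])


if __name__ == "__main__":
    corpus = ["alpha"] * 5 + ["beta"] * 4 + ["gamma"]
    print(f"coverage: {naive_coverage(corpus, toy_analyse):.2f}")
```

Precision is a separate measure: it asks whether the analyses returned for a manually verified set of forms are correct, so a transducer can have high coverage and still return wrong analyses.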
