To compress or not to compress? A Finite-State approach to Nen verbal morphology
Saliha Muradoglu, Nicholas Evans, Hanna Suominen
Unverified — Be the first to reproduce this paper.
ReproduceAbstract
This paper describes the development of a verbal morphological parser for an under-resourced Papuan language, Nen. Nen verbal morphology is particularly complex, with a transitive verb taking up to 1,740 unique features. The structural properties exhibited by Nen verbs raises interesting choices for analysis. Here we compare two possible methods of analysis: `Chunking' and decomposition. `Chunking' refers to the concept of collating morphological segments into one, whereas the decomposition model follows a more classical linguistic approach. Both models are built using the Finite-State Transducer toolkit foma. The resultant architecture shows differences in size and structural clarity. While the `Chunking' model is under half the size of the full de-composed counterpart, the decomposition displays higher structural order. In this paper, we describe the challenges encountered when modelling a language exhibiting distributed exponence and present the first morphological analyser for Nen, with an overall accuracy of 80.3\%.