
Resolving Pronouns for a Resource-Poor Language, Malayalam Using Resource-Rich Language, Tamil.

2019-09-01 · RANLP 2019

Sobha Lalitha Devi


Abstract

In this paper we describe in detail how a resource-rich language can be used to resolve pronouns in a resource-poor language. The source language, which is the resource-rich language in this study, is Tamil, and the resource-poor language is Malayalam; both belong to the same language family, Dravidian. The pronominal resolution system developed for Tamil uses conditional random fields (CRFs). Our approach is to apply the Tamil language model to Malayalam test data, and we detail the processing required for the Malayalam data. The syntactic similarity between the two languages is exploited in identifying the features used to develop the Tamil language model. The word form, or lexical item, is not used as a feature for training the CRFs. Evaluation on Malayalam Wikipedia data shows that the approach is sound: the results, though not as good as those for Tamil, are comparable.
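The idea of leaving the word form out of the feature set can be illustrated with a minimal sketch. It is not the paper's implementation: the tag names (`pos`, `chunk`, `morph`), the one-token context window, and the function names are all assumptions made for illustration, and the tokens are placeholders instead of real Tamil or Malayalam words. The point it shows is that two sentences with the same syntactic tags but different word forms yield identical features, which is what would let a model trained on one language be applied to the other.

```python
from typing import Dict, List

# Hypothetical token representation: a dict of tags plus a "form" field.
# The tag inventory here is illustrative, not taken from the paper.
Token = Dict[str, str]


def delexicalized_features(sent: List[Token], i: int) -> Dict[str, str]:
    """Build features for token i from tags only; the word form is ignored."""
    tok = sent[i]
    feats = {
        "pos": tok["pos"],
        "chunk": tok["chunk"],
        "morph": tok.get("morph", "NONE"),
    }
    # One-token context window on each side (an assumed window size).
    if i > 0:
        feats["prev_pos"] = sent[i - 1]["pos"]
    else:
        feats["BOS"] = "1"
    if i < len(sent) - 1:
        feats["next_pos"] = sent[i + 1]["pos"]
    else:
        feats["EOS"] = "1"
    return feats


def sentence_features(sent: List[Token]) -> List[Dict[str, str]]:
    """Feature dicts for every token in a sentence."""
    return [delexicalized_features(sent, i) for i in range(len(sent))]


# Placeholder sentences: same tags, different (made-up) word forms.
source_sent = [
    {"form": "src_w1", "pos": "PRP", "chunk": "B-NP", "morph": "3SM"},
    {"form": "src_w2", "pos": "VM", "chunk": "B-VGF"},
]
target_sent = [
    {"form": "tgt_w1", "pos": "PRP", "chunk": "B-NP", "morph": "3SM"},
    {"form": "tgt_w2", "pos": "VM", "chunk": "B-VGF"},
]

same = sentence_features(source_sent) == sentence_features(target_sent)
print(same)
```

Feature dicts of this shape are the usual input to CRF toolkits; which toolkit and which exact features the paper uses is not stated in the abstract.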
