Compound Type Identification in Sanskrit: What Roles do the Corpus and Grammar Play?

2016-12-01 · WS 2016

Amrith Krishna, Pavankumar Satuluri, Shubham Sharma, Apurv Kumar, Pawan Goyal

Abstract

We propose a classification framework for semantic type identification of compounds in Sanskrit. We broadly classify the compounds into four classes, namely Avyayībhāva, Tatpuruṣa, Bahuvrīhi and Dvandva. Our classification is based on the traditional classification system followed by the ancient grammar treatise Aṣṭādhyāyī, proposed by Pāṇini 25 centuries ago. We construct an elaborate feature space for our system by combining conditional rules from the grammar Aṣṭādhyāyī, semantic relations between the compound components from the lexical database Amarakoṣa, and linguistic structures obtained from the data using Adaptor Grammars. Our in-depth analysis of the feature space highlights the inadequacy of Aṣṭādhyāyī, a generative grammar, in classifying the data samples. Our experimental results validate the effectiveness of using lexical databases, as suggested by Amba Kulkarni and Anil Kumar, and put forward a new research direction by introducing linguistic patterns obtained from Adaptor Grammars for effective identification of compound type. We utilise an ensemble-based approach specifically designed for handling skewed datasets and, experimenting with various classification methods, achieve an overall accuracy of 0.77 using random forest classifiers.
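
The abstract does not spell out how the ensemble handles class skew. As a rough illustration only, the sketch below shows one common strategy: draw several class-balanced subsets by undersampling, train one classifier per subset, and combine predictions by majority vote. The function names, the class counts, and the choice of undersampling are all assumptions made for this example and are not taken from the paper; the per-subset classifiers (random forests in the paper) are omitted.

```python
import random
from collections import Counter, defaultdict


def balanced_subsets(labels, n_subsets, seed=0):
    """Return n_subsets index lists, each holding the same number of
    samples per class (undersampled to the size of the rarest class)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    smallest = min(len(idxs) for idxs in by_class.values())
    subsets = []
    for _ in range(n_subsets):
        subset = []
        for label in sorted(by_class):
            # Sample without replacement within each class.
            subset.extend(rng.sample(by_class[label], smallest))
        subsets.append(subset)
    return subsets


def majority_vote(predictions):
    """Combine the per-classifier predictions for one sample."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical, deliberately skewed label distribution (not the paper's data).
labels = (["Tatpuruṣa"] * 60 + ["Bahuvrīhi"] * 25
          + ["Dvandva"] * 10 + ["Avyayībhāva"] * 5)
subsets = balanced_subsets(labels, n_subsets=3)
```

Each subset would then be used to train one base classifier, with `majority_vote` merging their outputs for every test compound.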
