Creation of Corpus and Analysis in Code-Mixed Kannada-English Social Media Data for POS Tagging

2020-12-01 · ICON 2020

Abhinav Reddy Appidi, Vamshi Krishna Srirangam, Darsi Suhas, Manish Shrivastava


Abstract

Part-of-Speech (POS) tagging is an essential task for many Natural Language Processing (NLP) applications. A significant amount of work has been done on POS tagging for resource-rich languages. POS tagging is an essential phase of text analysis for understanding the semantics and context of language. These tags are useful for higher-level tasks such as building parse trees, which can in turn be used for Named Entity Recognition, Coreference Resolution, Sentiment Analysis, and Question Answering. Work has been done on code-mixed social media corpora, but not on POS tagging of Kannada-English code-mixed data. Here, we present a Kannada-English code-mixed social media corpus annotated with corresponding POS tags. We also experimented with the machine learning classification models CRF, Bi-LSTM, and Bi-LSTM-CRF on our corpus.

Tasks

Coreference Resolution, Named Entity Recognition (NER), POS Tagging, Question Answering, Sentiment Analysis
