JUNLP at SemEval-2020 Task 9: Sentiment Analysis of Hindi-English Code Mixed Data Using Grid Search Cross Validation

2020-12-01SEMEVALUnverified0· sign in to hype

Avishek Garain, Sainik Mahata, Dipankar Das

Unverified — Be the first to reproduce this paper.

Abstract

Code-mixing is a phenomenon which arises mainly in multilingual societies. Multilingual people, who are well versed in their native languages and also English speakers, tend to code-mix using English-based phonetic typing and the insertion of anglicisms in their main language. This linguistic phenomenon poses a great challenge to conventional NLP domains such as Sentiment Analysis, Machine Translation, and Text Summarization, to name a few. In this work, we focus on working out a plausible solution to the domain of Code-Mixed Sentiment Analysis. This work was done as participation in the SemEval-2020 Sentimix Task, where we focused on the sentiment analysis of English-Hindi code-mixed sentences. our username for the submission was ``sainik.mahata'' and team name was ``JUNLP''. We used feature extraction algorithms in conjunction with traditional machine learning algorithms such as SVR and Grid Search in an attempt to solve the task. Our approach garnered an f1-score of 66.2\% when tested using metrics prepared by the organizers of the task.

Tasks

Machine Translation Sentiment Analysis Text Summarization Translation

JUNLP at SemEval-2020 Task 9: Sentiment Analysis of Hindi-English Code Mixed Data Using Grid Search Cross Validation

Abstract

Tasks

Reproductions