
GJG@TamilNLP-ACL2022: Using Transformers for Abusive Comment Classification in Tamil

2022-05-01 · DravidianLangTech (ACL) 2022

Gaurang Prasad, Janvi Prasad, Gunavathi C


Abstract

This paper presents transformer-based models for the “Abusive Comment Detection” shared task at the Second Workshop on Speech and Language Technologies for Dravidian Languages at ACL 2022. Our team participated in both multi-class classification sub-tasks of this shared task. The dataset for sub-task A consisted of Tamil text, while the dataset for sub-task B consisted of code-mixed Tamil-English text. Both datasets contained eight classes of abusive comments. We fine-tuned XLM-RoBERTa and DeBERTa base models on the training split for each sub-task. For sub-task A, the XLM-RoBERTa model achieved an accuracy of 0.66 and the DeBERTa model achieved an accuracy of 0.62. For sub-task B, both models achieved a classification accuracy of 0.72; however, the DeBERTa model performed better on the other classification metrics. Our team ranked 2nd in the code-mixed classification sub-task and 8th in the Tamil-text sub-task.
