Precog-LTRC-IIITH at GermEval 2021: Ensembling Pre-Trained Language Models with Feature Engineering

2021-09-01 · GermEval 2021 · Code Available

T. H. Arjun, Arvindh A., Kumaraguru Ponnurangam


Abstract

We describe our participation in all the subtasks of the GermEval 2021 shared task on the identification of Toxic, Engaging, and Fact-Claiming Comments. Our system is an ensemble of state-of-the-art pre-trained models fine-tuned with carefully engineered features. We show that feature engineering and data augmentation can be helpful when the training data is sparse. We achieve F1 scores of 66.87, 68.93, and 73.91 on the Toxic, Engaging, and Fact-Claiming comment identification subtasks, respectively.
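The ensembling step described in the abstract can be sketched as simple probability averaging over several fine-tuned models. This is a minimal illustration only, not the authors' exact method: the model outputs below are stand-in numbers, and the function name is hypothetical.

```python
# Minimal sketch of ensembling: average per-class probabilities from several
# models and threshold the result. In the real system the probabilities would
# come from fine-tuned pre-trained language models combined with engineered
# features; here they are hard-coded stand-ins.

def ensemble_average(prob_lists, threshold=0.5):
    """Average binary-class probabilities from several models and threshold.

    prob_lists: list of per-model probability lists, one float per example.
    Returns a list of 0/1 labels, one per example.
    """
    n_models = len(prob_lists)
    n_examples = len(prob_lists[0])
    labels = []
    for i in range(n_examples):
        avg = sum(model[i] for model in prob_lists) / n_models
        labels.append(1 if avg >= threshold else 0)
    return labels

# Example: probabilities from two stand-in models for three comments.
model_a = [0.9, 0.2, 0.6]
model_b = [0.7, 0.4, 0.3]
print(ensemble_average([model_a, model_b]))  # → [1, 0, 0]
```

Averaging probabilities (rather than majority-voting hard labels) lets a confident model outvote an uncertain one, which is a common reason this style of ensembling helps when training data is sparse.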
