Simple vs Oversampling-based Classification Methods for Fine Grained Arabic Dialect Identification in Twitter

2020-12-01 · COLING (WANLP) 2020

Mohamed Lichouri, Mourad Abbas

Abstract

This paper describes our experiments on country-level Arabic dialect identification, in which we compare a set of classifiers. The best results were obtained with a Linear Support Vector Classification (LSVC) model combined with Random Over Sampling (ROS), which yielded an F1-score of 18.74% in the post-evaluation phase. In the evaluation phase, our best submitted system achieved an F1-score of 18.27%, close to the average F1-score (18.80%) of all submitted systems.
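
To illustrate the oversampling step named in the abstract, here is a minimal sketch of random oversampling in plain Python. It is not the authors' implementation: the function name `random_oversample`, the `seed` parameter, and the toy tweets and country labels are all hypothetical, and the abstract does not specify the features or settings the authors used. The sketch duplicates randomly chosen minority-class samples until every class matches the size of the largest class.

```python
import random
from collections import Counter, defaultdict


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen samples of smaller classes until every
    class has as many samples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    # Every class is grown to the size of the largest one.
    target = max(len(items) for items in by_label.values())

    out_texts, out_labels = [], []
    for label, items in by_label.items():
        # Sample with replacement to fill the gap for this class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Toy, made-up data (not from the paper): three "Egypt" tweets, one "Iraq" tweet.
texts = ["tweet a", "tweet b", "tweet c", "tweet d"]
labels = ["Egypt", "Egypt", "Egypt", "Iraq"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(balanced_labels))
```

In a typical setup, the balanced data would then be vectorized and passed to a linear SVM classifier. Oversampling is normally applied to the training split only, so that duplicated samples do not leak into the evaluation data.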

Tasks

Dialect Identification
