Simple vs Oversampling-based Classification Methods for Fine Grained Arabic Dialect Identification in Twitter
2020-12-01 · COLING (WANLP) 2020
Mohamed Lichouri, Mourad Abbas
Abstract
This paper describes our experiments on country-level Arabic dialect identification, in which we compare a set of classifiers. The best result was obtained with a Linear Support Vector Classification (LSVC) model trained after Random Over Sampling (ROS), which yielded an F1-score of 18.74% in the post-evaluation phase. In the evaluation phase, our best submitted system achieved an F1-score of 18.27%, close to the average F1-score (18.80%) of all submitted systems.
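The abstract does not say how Random Over Sampling was implemented. As an illustration only, the sketch below shows the general idea in plain Python: examples from under-represented classes are duplicated at random until every class is as large as the largest one. The function name `random_oversample`, the toy tweets, and the country labels are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class examples at random until all classes
    have as many examples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the examples by their class label.
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_label.values())

    out_texts, out_labels = [], []
    for label, items in by_label.items():
        # Sample with replacement to fill the gap; the majority class adds none.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Toy, made-up data: one frequent class and two rare ones.
texts = ["t1", "t2", "t3", "t4", "t5", "t6"]
labels = ["EG", "EG", "EG", "EG", "DZ", "MA"]

bal_texts, bal_labels = random_oversample(texts, labels)
counts = Counter(bal_labels)
print(counts)
```

Oversampling of this kind is normally applied to the training split only, so that duplicated examples do not leak into the evaluation data. In practice a library implementation such as `RandomOverSampler` from imbalanced-learn could be used in place of this helper, with the balanced data then passed to a linear SVM; whether the authors did so is not stated in the abstract.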