Implementation of Naïve Bayes and Gini Index for Spam Email Classification

2021-05-03 · Computational and Simulation Vol. 6 No. 1 (2021): April, 2020 2021

Fikri Rozan Imadudin, Danang Triantoro Murdiansyah, Adiwijaya

Code Available

Abstract

Email remains a widely used medium for exchanging information, and spam remains a persistent problem for it. Spam is email that can pollute, damage, or disturb the recipient. In this study, we evaluate the performance and accuracy of Multinomial Naïve Bayes (MNNB) and Complete Gini-Index Text (GIT) for spam email filtering. We used 6-fold cross-validation to test the classifiers we built. We found that, with feature selection using only 80,000 features, the average result can exceed that of Multinomial Naïve Bayes without feature selection by 0.39%. Feature selection also speeds up classification and can remove features that are less relevant to the target category.
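The pipeline described in the abstract (score terms with a Gini index, keep the top-scoring features, train Multinomial Naïve Bayes, evaluate with 6-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the score used here is a commonly cited simplified Gini-index form for text, the sum over classes of P(t|c)² · P(c|t)², and the paper's Complete GIT variant may differ. The toy messages, the function names, the Laplace smoothing value, and the interleaved fold assignment are all hypothetical choices made for the example.

```python
import math
from collections import Counter, defaultdict

def gini_text_scores(docs, labels):
    """Score each term t as sum_c P(t|c)^2 * P(c|t)^2 (assumed simplified form),
    estimated from document frequencies."""
    docs_per_class = Counter(labels)
    df = defaultdict(Counter)  # term -> class -> number of docs containing it
    for tokens, y in zip(docs, labels):
        for t in set(tokens):
            df[t][y] += 1
    scores = {}
    for t, per_class in df.items():
        total = sum(per_class.values())
        scores[t] = sum(
            (n / docs_per_class[c]) ** 2 * (n / total) ** 2
            for c, n in per_class.items()
        )
    return scores

def select_top_k(scores, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    return set(sorted(scores, key=lambda t: (-scores[t], t))[:k])

def train_mnb(docs, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes with Laplace smoothing, restricted to vocab."""
    class_counts = Counter(labels)
    term_counts = {c: Counter() for c in class_counts}
    for tokens, y in zip(docs, labels):
        term_counts[y].update(t for t in tokens if t in vocab)
    model = {}
    for c, n_c in class_counts.items():
        total = sum(term_counts[c].values()) + alpha * len(vocab)
        log_like = {t: math.log((term_counts[c][t] + alpha) / total) for t in vocab}
        model[c] = (math.log(n_c / len(labels)), log_like)
    return model

def predict(model, tokens):
    """Return the class with the highest log posterior; unseen terms are skipped."""
    def score(c):
        log_prior, log_like = model[c]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(sorted(model), key=score)

def cross_validate(docs, labels, k_features, folds=6):
    """Mean accuracy over interleaved folds; features are selected on the
    training part of each fold only, to avoid leaking test data."""
    accuracies = []
    for f in range(folds):
        test_idx = set(range(f, len(docs), folds))
        train = [(d, y) for i, (d, y) in enumerate(zip(docs, labels))
                 if i not in test_idx]
        tr_docs, tr_labels = zip(*train)
        vocab = select_top_k(gini_text_scores(tr_docs, tr_labels), k_features)
        model = train_mnb(tr_docs, tr_labels, vocab)
        correct = sum(predict(model, docs[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / folds

# Hypothetical toy corpus: even indices are spam, odd indices are ham.
messages = [
    "free prize claim now", "meeting agenda for monday",
    "win free cash now", "project report attached here",
    "claim your free prize", "lunch on monday maybe",
    "cheap pills free offer", "see the report draft",
    "win cash offer now", "agenda for project meeting",
    "free offer claim cash", "draft attached for review",
]
docs = [m.split() for m in messages]
labels = ["spam" if i % 2 == 0 else "ham" for i in range(len(docs))]

mean_acc = cross_validate(docs, labels, k_features=10)
print(f"mean 6-fold accuracy with 10 selected features: {mean_acc:.2f}")
```

Selecting features inside each fold, rather than once on the full corpus, keeps the test messages out of the scoring step. On a real corpus the `k_features` argument would play the role of the 80,000-feature cutoff mentioned in the abstract.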
