Classifying Verses of the Quran using Doc2vec

2021-12-01 · ICON 2021

Menwa Alshammeri, Eric Atwell, Mohammad Alsalka

Abstract

The Quran, as a significant religious text, carries important spiritual and linguistic value. Understanding the text and inferring its underlying meanings entails semantic similarity analysis. We classified the verses of the Quran into 15 pre-defined categories, or concepts, based on the Qurany corpus, using Doc2Vec and Logistic Regression. Our classifier achieved 70% accuracy and a 60% F1-score using the distributed bag-of-words architecture. We then measured how semantically similar the documents within the same category are to each other and used this information to evaluate our model. We calculated the mean difference and average similarity values for each category to indicate how well our model describes that category.
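The within-category similarity check described in the abstract can be illustrated with a minimal sketch. It assumes that each verse has already been mapped to a vector (for example by a Doc2Vec model) and that "average similarity" means the mean pairwise cosine similarity among the vectors of one category; the abstract does not specify the exact measure, so this is an assumption. The function names, category labels, and toy vectors are hypothetical and are not taken from the paper or the Qurany corpus.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def average_within_category_similarity(vectors_by_category):
    """Mean pairwise cosine similarity per category.

    Assumption: this is one plausible reading of the abstract's
    'average similarity', not the authors' confirmed formula.
    Categories with fewer than two vectors map to None.
    """
    result = {}
    for category, vectors in vectors_by_category.items():
        pairs = list(combinations(vectors, 2))
        if not pairs:
            result[category] = None
            continue
        result[category] = sum(cosine(u, v) for u, v in pairs) / len(pairs)
    return result


# Hypothetical toy vectors standing in for learned verse embeddings.
toy_vectors = {
    "category_a": [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    "category_b": [[1.0, 1.0], [2.0, 2.0]],
}

scores = average_within_category_similarity(toy_vectors)
for name, score in scores.items():
    print(name, score)
```

Under this reading, a category whose verse vectors point in similar directions receives a score close to 1, while a category with scattered vectors receives a lower score, which is the kind of signal the abstract uses to judge how well the model describes each category.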
