Classifying the Ideological Orientation of User-Submitted Texts in Social Media

2022-12-12 · IEEE International Conference on Machine Learning and Applications (ICMLA) 2022 · Code Available

Kamalakkannan Ravi, Adan Ernesto Vela, Rickard Ewetz



Code

github.com/ADCLab/RedditIdeologyDB
Official · ★ 1

Abstract

With the long-term goal of understanding how language is used and evolves within online communities, this work explores the application of natural language processing techniques to classify text articles according to their ideological orientation (i.e., conservative or liberal). We first collect a balanced corpus of text articles posted to the online communities r/Liberal and r/Conservative on the social media website Reddit. Using the corpus, we develop and apply three classifiers. The baseline classifier is a Bayes model that accounts for each text article’s web domain; as such, its classification is independent of content. Next, we develop a support vector machine (SVM) model with term frequency-inverse document frequency (TF-IDF) features; this approach highlights differences in language, using a count-based feature space to differentiate text articles. Last, we evaluate a context-based transformer model (RoBERTa) and discuss its underperformance relative to the baseline and SVM models.
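To make the content-independent baseline concrete, the following is a minimal sketch of a single-feature Bayes classifier that looks only at an article's web domain. It is not the authors' implementation: the class name `DomainBayes`, the helper `domain_of`, the Laplace smoothing parameter `alpha`, and the placeholder `.test` URLs are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse


def domain_of(url):
    """Return the lowercase host of a URL, without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


class DomainBayes:
    """Toy Bayes classifier: P(label | domain) is proportional to
    P(domain | label) * P(label). Hypothetical sketch, not the paper's code."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength (assumed value)
        self.label_counts = Counter()
        self.domain_counts = defaultdict(Counter)
        self.domains = set()

    def fit(self, urls, labels):
        for url, label in zip(urls, labels):
            d = domain_of(url)
            self.label_counts[label] += 1
            self.domain_counts[label][d] += 1
            self.domains.add(d)
        return self

    def predict_proba(self, url):
        d = domain_of(url)
        total = sum(self.label_counts.values())
        vocab = len(self.domains) + 1  # one extra slot for unseen domains
        scores = {}
        for label, n in self.label_counts.items():
            prior = n / total
            likelihood = (self.domain_counts[label][d] + self.alpha) / (
                n + self.alpha * vocab
            )
            scores[label] = prior * likelihood
        z = sum(scores.values())
        return {label: s / z for label, s in scores.items()}

    def predict(self, url):
        probs = self.predict_proba(url)
        return max(probs, key=probs.get)


# Placeholder training data; the domains are invented, not taken from the corpus.
train_urls = [
    "https://www.left-example.test/a",
    "https://left-example.test/b",
    "https://right-example.test/c",
    "https://www.right-example.test/d",
]
train_labels = ["liberal", "liberal", "conservative", "conservative"]

model = DomainBayes().fit(train_urls, train_labels)
print(model.predict("https://left-example.test/new-article"))
```

Because the only feature is the domain, two articles from the same site always receive the same prediction regardless of their text, which is what makes this kind of model a content-independent baseline. For a domain never seen in training, the sketch falls back to the class priors.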

Tasks

Articles Classification News Classification
