Exploring Multi-Level Threats in Telegram Data with AI-Human Annotation: A Preliminary Study

2023-12-15 · 2023 22nd IEEE International Conference on Machine Learning and Applications (ICMLA)

Kamalakkannan Ravi, Adan Ernesto Vela, Elizabeth Jenaway, Steven Windisch

Abstract

This research addresses the crucial challenge of effectively measuring threats in social media comments targeting voting, public officials, and institutions in the United States. Our understanding of these online threats and their links to real-world risks is limited, making it difficult to assess their seriousness. To overcome these limitations, we propose a comprehensive threat level scale from 0 to 5 and collect a dataset of 1.3 million Telegram responses for developing and rigorously testing these threat levels. Additionally, we explore OpenAI-human annotation to efficiently label this vast dataset. Our innovative two-step transfer learning approach initially employs a pre-existing, pre-trained model for labeling, followed by expert validation. Next, we use the AI-annotated samples to develop independent models, and expert annotators verify their predictions. Notably, our findings demonstrate that the GPT-2 model, despite its smaller annotated training set, performs comparably to OpenAI's annotations, showcasing its potential for cost-effective threat detection with more annotated samples. With the long-term objective of establishing continuous threat-level monitoring, we identify the strengths and limitations of our current approach and propose a roadmap for enhancing threat detection.

Tasks

Information Retrieval, Text Classification, Transfer Learning, Violence and Weaponized Violence Detection
