SPADE: Structured Prompting Augmentation for Dialogue Enhancement in Machine-Generated Text Detection

2025-07-01

Haoyi Li, Angela Yifei Yuan, Soyeon Caren Han, Christopher Leckie

Code

github.com/angieyyf/spade-customer-service-dialogue
Official implementation, linked in the paper

Abstract

The increasing capability of large language models (LLMs) to generate synthetic content has heightened concerns about their misuse, driving the development of Machine-Generated Text (MGT) detection models. However, these detectors face significant challenges due to the lack of high-quality synthetic datasets for training. To address this issue, we propose SPADE, a structured framework for detecting synthetic dialogues using prompt-based positive and negative samples. Our proposed methods yield 14 new dialogue datasets, which we benchmark against eight MGT detection models. The results demonstrate improved generalization performance when utilizing a mixed dataset produced by the proposed augmentation frameworks, offering a practical approach to enhancing LLM application security. Considering that real-world agents lack knowledge of future opponent utterances, we simulate online dialogue detection and examine the relationship between chat history length and detection accuracy. Our open-source datasets, code and prompts can be downloaded from https://github.com/AngieYYF/SPADE-customer-service-dialogue.
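
The online detection setting mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the SPADE repository: the function name `accuracy_by_history_length`, the dialogue-level binary labels (1 = machine-generated, 0 = human), and the keyword-based `toy_detector` are all assumptions made for illustration. The idea shown is only that a detector is scored on growing prefixes of each dialogue, so it never sees future utterances, and accuracy is then grouped by history length.

```python
from typing import Callable, Dict, List, Sequence

Dialogue = List[str]  # utterances in chronological order


def accuracy_by_history_length(
    dialogues: Sequence[Dialogue],
    labels: Sequence[int],
    detector: Callable[[Dialogue], int],
) -> Dict[int, float]:
    """Accuracy of `detector` when it sees only the first k utterances.

    Assumption: one binary label per dialogue (1 = machine-generated).
    """
    correct: Dict[int, int] = {}
    total: Dict[int, int] = {}
    for dialogue, label in zip(dialogues, labels):
        for k in range(1, len(dialogue) + 1):
            prefix = dialogue[:k]  # history so far, no future utterances
            total[k] = total.get(k, 0) + 1
            correct[k] = correct.get(k, 0) + int(detector(prefix) == label)
    return {k: correct[k] / total[k] for k in sorted(total)}


def toy_detector(prefix: Dialogue) -> int:
    """Hypothetical stand-in for a real MGT detector: a keyword rule."""
    return int(any("as an ai" in utterance.lower() for utterance in prefix))


# Toy data, invented for this example only.
dialogues = [
    ["Hi, I need a hotel.", "Sure, as an AI assistant I can help.", "Great."],
    ["Hello", "How can I help?"],
]
labels = [1, 0]

result = accuracy_by_history_length(dialogues, labels, toy_detector)
print(result)
```

In practice `toy_detector` would be replaced by a trained detection model, and the resulting mapping from history length to accuracy could be plotted to inspect how much chat history is needed before detection becomes reliable.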
