A Comprehensive Framework for Semantic Similarity Analysis of Human and AI-Generated Text Using Transformer Architectures and Ensemble Techniques

2025-01-24

Lifu Gao, Ziwei Liu, Qi Zhang

Abstract

The rapid advancement of large language models (LLMs) has made detecting AI-generated text an increasingly critical challenge. Traditional methods often fail to capture the nuanced semantic differences between human-written and machine-generated content. We therefore propose a novel approach based on semantic similarity analysis, using a multi-layered architecture that combines a pre-trained DeBERTa-v3-large model, bidirectional LSTMs, and linear attention pooling to capture both local and global semantic patterns. To enhance performance, we employ input and output augmentation techniques such as sector-level context integration and wide output configurations. These techniques enable the model to learn more discriminative features and to generalize across diverse domains. Experimental results show that this approach outperforms traditional methods, supporting its usefulness for AI-generated text detection and other text comparison tasks.
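The abstract does not spell out how the attention pooling step is computed. As a rough illustration only, the sketch below assumes one common reading: each token vector gets a scalar score from a single linear layer, the scores are normalized with a softmax, and the pooled vector is the score-weighted sum of the token vectors. The function name `attention_pool` and its parameters are hypothetical and are not taken from the paper; in the described architecture the token vectors would presumably come from the BiLSTM running on top of DeBERTa-v3-large, whereas here they are plain Python lists.

```python
import math
from typing import List, Sequence


def attention_pool(
    token_vectors: Sequence[Sequence[float]],
    weights: Sequence[float],
    bias: float = 0.0,
) -> List[float]:
    """Pool a sequence of token vectors into a single vector.

    Assumed formulation (not confirmed by the paper):
      score_t = weights . h_t + bias   (one linear layer)
      attn    = softmax(score)         (over the sequence)
      pooled  = sum_t attn_t * h_t
    """
    if not token_vectors:
        raise ValueError("token_vectors must not be empty")

    # Linear scoring: one scalar per token.
    scores = [
        sum(w * x for w, x in zip(weights, vec)) + bias
        for vec in token_vectors
    ]

    # Numerically stable softmax over the sequence dimension.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]

    # Attention-weighted sum of the token vectors.
    dim = len(token_vectors[0])
    return [
        sum(a * vec[i] for a, vec in zip(attn, token_vectors))
        for i in range(dim)
    ]


if __name__ == "__main__":
    # With zero scoring weights every token gets equal attention,
    # so the result is the plain mean of the token vectors.
    print(attention_pool([[1.0, 2.0], [3.0, 4.0]], weights=[0.0, 0.0]))
```

With all-zero scoring weights the pooling reduces to mean pooling. Learned weights let the model emphasize the tokens that matter most for the human-versus-AI decision, which is the usual motivation for this kind of pooling over a simple average.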

Tasks

Data Augmentation, Domain Adaptation, Mamba, Semantic Similarity, Semantic Textual Similarity, Spelling Correction, Text Classification, Text Detection
