BusterX: MLLM-Powered AI-Generated Video Forgery Detection and Explanation

2025-05-19Code Available1· sign in to hype

Haiquan Wen, Yiwei He, Zhenglin Huang, Tianxiao Li, Zihan Yu, Xingru Huang, Lu Qi, Baoyuan Wu, Xiangtai Li, Guangliang Cheng

arXiv PDF

Code Available — Be the first to reproduce this paper.

Reproduce

Code

github.com/l8cv/busterx
OfficialIn papernone★ 37

Abstract

Advances in AI generative models facilitate super-realistic video synthesis, amplifying misinformation risks via social media and eroding trust in digital content. Several research works have explored new deepfake detection methods on AI-generated images to alleviate these risks. However, with the fast development of video generation models, such as Sora and WanX, there is currently a lack of large-scale, high-quality AI-generated video datasets for forgery detection. In addition, existing detection approaches predominantly treat the task as binary classification, lacking explainability in model decision-making and failing to provide actionable insights or guidance for the public. To address these challenges, we propose GenBuster-200K, a large-scale AI-generated video dataset featuring 200K high-resolution video clips, diverse latest generative techniques, and real-world scenes. We further introduce BusterX, a novel AI-generated video detection and explanation framework leveraging multimodal large language model (MLLM) and reinforcement learning for authenticity determination and explainable rationale. To our knowledge, GenBuster-200K is the first large-scale, high-quality AI-generated video dataset that incorporates the latest generative techniques for real-world scenarios. BusterX is the first framework to integrate MLLM with reinforcement learning for explainable AI-generated video detection. Extensive comparisons with state-of-the-art methods and ablation studies validate the effectiveness and generalizability of BusterX. The code, models, and datasets will be released.

Tasks

Binary Classification DeepFake Detection Face Swapping Large Language Model Misinformation Multimodal Large Language Model Video Generation

BusterX: MLLM-Powered AI-Generated Video Forgery Detection and Explanation

Code

Abstract

Tasks

Reproductions