UNITE-FND: Reframing Multimodal Fake News Detection through Unimodal Scene Translation

2025-02-16

Arka Mukherjee, Shreya Ghosh

Abstract

Multimodal fake news detection typically demands complex architectures and substantial computational resources, which makes deployment in real-world settings difficult. We introduce UNITE-FND, a novel framework that reframes multimodal fake news detection as a unimodal text classification task. We propose six specialized prompting strategies with Gemini 1.5 Pro that convert visual content into structured textual descriptions, enabling efficient text-only models to preserve critical visual information. To benchmark our approach, we introduce Uni-Fakeddit-55k, a curated family of datasets of 55,000 samples each, with every dataset processed through our multimodal-to-unimodal translation framework. Experimental results show that UNITE-FND achieves 92.52% accuracy in binary classification, surpassing prior multimodal models while reducing computational costs by over 10x (TinyBERT variant: 14.5M parameters vs. 250M+ in SOTA models). Additionally, we propose a suite of five novel metrics to evaluate image-to-text conversion quality and verify that visual information is preserved. Our results indicate that structured text-based representations can replace direct multimodal processing with minimal loss of accuracy, making UNITE-FND a practical and scalable alternative for resource-constrained environments.
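The reframing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Post` type, the `to_unimodal_text` helper, the `[TITLE]`/`[IMAGE]` markers, and the `stub_describer` function are all hypothetical names introduced here, and the stub stands in for a vision-language model call (Gemini 1.5 Pro in the paper) whose prompts are not given in this listing. The last lines only check the parameter figures quoted in the abstract.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Post:
    """A hypothetical multimodal sample: a headline plus an image file."""
    title: str
    image_path: str


def to_unimodal_text(post: Post, describe_image: Callable[[str], str]) -> str:
    """Merge the headline with a textual description of the image.

    The result is a single string that a text-only classifier could consume.
    The [TITLE]/[IMAGE] markers are an assumption, not the paper's format.
    """
    description = describe_image(post.image_path)
    return f"[TITLE] {post.title} [IMAGE] {description}"


def stub_describer(image_path: str) -> str:
    """Placeholder for a vision-language model call (assumption)."""
    return f"placeholder description of {image_path}"


post = Post(title="Shark swims down flooded highway", image_path="example.jpg")
text = to_unimodal_text(post, stub_describer)
print(text)

# Parameter counts quoted in the abstract: 14.5M (TinyBERT variant) vs. 250M+.
tinybert_params = 14.5e6
sota_params = 250e6
ratio = round(sota_params / tinybert_params, 1)
print(ratio)  # 17.2, consistent with the "over 10x" claim
```

In a real pipeline, `stub_describer` would be replaced by the image-to-text step and the returned string would be fed to a text classifier such as a fine-tuned TinyBERT.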

Tasks

Binary Classification, Fake News Detection, Image to Text, Text Classification
