Robust Adaptation of Large Multimodal Models for Retrieval Augmented Hateful Meme Detection
Jingbiao Mei, Jinghong Chen, Guangyu Yang, Weizhe Lin, Bill Byrne
Code: github.com/JingbiaoMei/RGCL (official PyTorch implementation)
Abstract
Hateful memes have become a significant concern on the Internet, necessitating robust automated detection systems. While large multimodal models (LMMs) have shown promise in hateful meme detection, they face notable challenges, including sub-optimal in-domain performance and limited out-of-domain generalization. Recent studies further reveal the limitations of both supervised fine-tuning (SFT) and in-context learning when applied to LMMs in this setting. To address these issues, we propose a robust adaptation framework for hateful meme detection that improves in-domain accuracy and cross-domain generalization while preserving the general vision-language capabilities of LMMs. Experiments on six meme classification datasets show that our approach achieves state-of-the-art performance, outperforming larger agentic systems. Moreover, our method generates higher-quality rationales for explaining hateful content than standard SFT, improving model interpretability.
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Hateful Memes | RA-HMD (Qwen2-VL-7B) | ROC-AUC | 0.91 | — | Unverified |
| Hateful Memes | RA-HMD (LLaVA-1.5-7B) | ROC-AUC | 0.90 | — | Unverified |
| Hateful Memes | RA-HMD (Qwen2-VL-2B) | ROC-AUC | 0.88 | — | Unverified |
| MultiOFF | RA-HMD (Qwen2-VL-7B) | Accuracy (%) | 71.1 | — | Unverified |
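For readers attempting verification of the ROC-AUC figures above, the metric has a simple pairwise interpretation: the fraction of (hateful, non-hateful) meme pairs where the model assigns the hateful meme a higher score. A minimal dependency-free sketch, with purely illustrative labels and scores (not taken from the paper):

```python
def roc_auc(labels, scores):
    """Pairwise ROC-AUC: fraction of (positive, negative) pairs where
    the positive example receives the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative ground truth (1 = hateful) and model P(hateful) scores.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.8, 0.35, 0.4, 0.1, 0.7, 0.3]
print(roc_auc(labels, scores))  # 0.9375
```

In practice one would replace the toy lists with the model's per-meme hatefulness probabilities on the dataset's test split; `sklearn.metrics.roc_auc_score` computes the same quantity at scale.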