HateSieve: A Contrastive Learning Framework for Detecting and Segmenting Hateful Content in Multimodal Memes

2024-08-11

Xuanyu Su, Yansong Li, Diana Inkpen, Nathalie Japkowicz

Abstract

Amidst the rise of Large Multimodal Models (LMMs) and their widespread application in generating and interpreting complex content, the risk of propagating biased and harmful memes remains significant. Current safety measures often fail to detect subtly integrated hateful content within "Confounder Memes". To address this, we introduce HateSieve, a new framework designed to enhance the detection and segmentation of hateful elements in memes. HateSieve features a novel Contrastive Meme Generator that creates semantically paired memes, a customized triplet dataset for contrastive learning, and an Image-Text Alignment module that produces context-aware embeddings for accurate meme segmentation. Empirical experiments on the Hateful Meme Dataset show that HateSieve not only surpasses existing LMMs in performance with fewer trainable parameters but also offers a robust mechanism for precisely identifying and isolating hateful content.

Caution: Contains academic discussions of hate speech; viewer discretion advised.

Tasks

Contrastive Learning, Triplet
