The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
Douwe Kiela, Hamed Firooz, Aravind Mohan, Vedanuj Goswami, Amanpreet Singh, Pratik Ringshia, Davide Testuggine
Code
- github.com/facebookresearch/mmf (official, referenced in paper; PyTorch; ★ 5,624)
- github.com/rizavelioglu/hateful_memes-hate_detectron (★ 61)
- github.com/holman57/Hateful-Memes (PyTorch; ★ 2)
- github.com/SebKleiner/Hateful_Memes (★ 0)
Abstract
This work proposes a new challenge set for multimodal classification, focusing on detecting hate speech in multimodal memes. It is constructed such that unimodal models struggle and only multimodal models can succeed: difficult examples ("benign confounders") are added to the dataset to make it hard to rely on unimodal signals. The task requires subtle reasoning, yet is straightforward to evaluate as a binary classification problem. We provide baseline performance numbers for unimodal models, as well as for multimodal models with various degrees of sophistication. We find that state-of-the-art methods perform poorly compared to humans (64.73% vs. 84.7% accuracy), illustrating the difficulty of the task and highlighting the challenge that this important problem poses to the community.
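The abstract frames the task as a binary classification problem scored with accuracy and ROC-AUC. As a minimal sketch of that evaluation, the snippet below computes both metrics in plain Python; the labels and predicted probabilities are invented for illustration and do not come from the paper or its models.

```python
def roc_auc(y_true, y_prob):
    """ROC-AUC via the pairwise (Mann-Whitney) formulation: the fraction
    of (positive, negative) pairs the model ranks correctly, ties = 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy(y_true, y_prob, threshold=0.5):
    """Accuracy after thresholding the predicted probabilities."""
    preds = [int(p >= threshold) for p in y_prob]
    return sum(int(p == y) for p, y in zip(preds, y_true)) / len(y_true)

# Hypothetical toy data: 1 = hateful meme, 0 = not hateful.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.35, 0.8, 0.55]  # model P(hateful)

print(roc_auc(y_true, y_prob))   # 0.9375
print(accuracy(y_true, y_prob))  # 0.75
```

The same numbers can be obtained with `sklearn.metrics.roc_auc_score` and `accuracy_score`; the hand-rolled versions are shown only to make the definitions explicit.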
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Hateful Memes | Human | ROC-AUC | 0.83 | — | Unverified |
| Hateful Memes | Visual BERT COCO | ROC-AUC | 0.75 | — | Unverified |