Hate Speech Detection

Hate speech detection is the task of determining whether a communication, such as text or audio, expresses hatred or encourages violence towards a person or a group of people. Such content is usually based on prejudice against 'protected characteristics' such as ethnicity, gender, sexual orientation, religion, or age. Example benchmarks include ETHOS and HateXplain. Models are commonly evaluated with metrics such as the F-score (also called the F-measure).
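As a rough illustration of the F-score mentioned above, the sketch below computes binary F1 and macro-averaged F1 in plain Python. The function names and the toy labels (1 = hateful, 0 = not hateful) are assumptions made for this example and are not taken from any benchmark listed here.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_score(y_true, y_pred, positive=c) for c in labels) / len(labels)


# Hypothetical toy labels: 1 = hateful, 0 = not hateful.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

score = f1_score(y_true, y_pred)
macro = macro_f1(y_true, y_pred)
print(f"F1 = {score:.2f}, macro F1 = {macro:.2f}")
```

Macro averaging gives every class equal weight, which matters when hateful examples are a small minority of the data.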

Papers

Showing 176–200 of 507 papers

Title	Date	Tasks	Status	Score
Leveraging Multilingual Transformers for Hate Speech Detection	Jan 8, 2021	Feature Selection, General Classification	Code Available	5
Exploring Hate Speech Detection in Multimodal Publications	Oct 9, 2019	Hate Speech Detection	Code Available	5
Exploring Hate Speech Detection with HateXplain and BERT	Aug 9, 2022	Hate Speech Detection	Code Available	5
Automatic Detection of Sexist Statements Commonly Used at the Workplace	Jul 8, 2020	Hate Speech Detection, Sentiment Analysis	Code Available	5
MigrationsKB: A Knowledge Base of Public Attitudes towards Migrations and their Driving Factors	Aug 17, 2021	Entity Linking, Hate Speech Detection	Code Available	5
Checking HateCheck: a cross-functional analysis of behaviour-aware learning for hate speech detection	Apr 8, 2022	Hate Speech Detection	Code Available	5
Automated Hate Speech Detection and the Problem of Offensive Language	Mar 11, 2017	Hate Speech Detection	Code Available	5
DeL-haTE: A Deep Learning Tunable Ensemble for Hate Speech Detection	Nov 3, 2020	Hate Speech Detection, Transfer Learning	Code Available	5
HateBERT: Retraining BERT for Abusive Language Detection in English	Oct 23, 2020	Abusive Language, Hate Speech Detection	Code Available	5
DefVerify: Do Hate Speech Models Reflect Their Dataset's Definition?	Oct 21, 2024	Hate Speech Detection	Code Available	5
Multilingual Auxiliary Tasks Training: Bridging the Gap between Languages for Zero-Shot Transfer of Hate Speech Detection Models	Oct 24, 2022	Cross-Lingual Transfer, Hate Speech Detection	Code Available	5
Going Extreme: Comparative Analysis of Hate Speech in Parler and Gab	Jan 27, 2022	Hate Speech Detection, Transfer Learning	Code Available	5
Deep Learning for Hate Speech Detection in Tweets	Jun 1, 2017	16k, Deep Learning	Code Available	5
Gender Bias Mitigation for Bangla Classification Tasks	Nov 16, 2024	Classification, Hate Speech Detection	Code Available	5
Features or Spurious Artifacts? Data-centric Baselines for Fair and Robust Hate Speech Detection	Jul 1, 2022	Fairness, Hate Speech Detection	Code Available	5
GPT-HateCheck: Can LLMs Write Better Functional Tests for Hate Speech Detection?	Feb 23, 2024	Diagnostic, Hate Speech Detection	Code Available	5
A BERT-Based Transfer Learning Approach for Hate Speech Detection in Online Social Media	Oct 28, 2019	Hate Speech Detection, Language Modelling	Code Available	5
Hate Speech Detection: A Solved Problem? The Challenging Case of Long Tail on Twitter	Feb 27, 2018	Hate Speech Detection	Code Available	5
Noisy Self-Training with Data Augmentations for Offensive and Hate Speech Detection Tasks	Jul 31, 2023	Data Augmentation, Hate Speech Detection	Code Available	5
PEACE: Cross-Platform Hate Speech Detection- A Causality-guided Framework	Jun 15, 2023	Hate Speech Detection	Code Available	5
Fine-tuning of Pre-trained Transformers for Hate, Offensive, and Profane Content Detection in English and Marathi	Oct 25, 2021	Hate Speech Detection, Sentence	Code Available	5
Improving Hate Speech Classification with Cross-Taxonomy Dataset Integration	Mar 7, 2025	Hate Speech Detection	Code Available	5
DeepHate: Hate Speech Detection via Multi-Faceted Text Representations	Mar 14, 2021	Hate Speech Detection, Word Embeddings	Unverified	0
A Turkish Hate Speech Dataset and Detection System	Jun 1, 2022	Binary Classification, Hate Speech Detection	Unverified	0
Data Expansion using Back Translation and Paraphrasing for Hate Speech Detection	May 25, 2021	Data Augmentation, Decoder	Unverified	0


Datasets: Ethos Binary; HateXplain; Ethos MultiLabel; Waseem et al., 2018; AbusEval; Automatic Misogynistic Identification; HateMM; HatEval; OffensEval 2019; ToLD-Br; bajer_danish_misogyny; DKhate

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	BiLSTM + static BE	F1-score	0.8	—	Unverified
2	BERT	F1-score	0.79	—	Unverified
3	BiLSTM+Attention+FT	F1-score	0.77	—	Unverified
4	OPT-175B (few-shot)	F1-score	0.76	—	Unverified
5	CNN+Attention+FT+GV	F1-score	0.74	—	Unverified
6	OPT-175B (one-shot)	F1-score	0.71	—	Unverified
7	OPT-175B (zero-shot)	F1-score	0.67	—	Unverified
8	SVM	F1-score	0.66	—	Unverified
9	Random Forests	F1-score	0.64	—	Unverified
10	Davinci (zero-shot)	F1-score	0.63	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	BERT-MRP	AUROC	0.86	—	Unverified
2	BERT-RP	AUROC	0.85	—	Unverified
3	BERT-HateXplain [LIME]	AUROC	0.85	—	Unverified
4	BERT-HateXplain [Attn]	AUROC	0.85	—	Unverified
5	BERT [Attn]	AUROC	0.84	—	Unverified
6	BiRNN-HateXplain [Attn]	AUROC	0.81	—	Unverified
7	BiRNN-Attn [Attn]	AUROC	0.8	—	Unverified
8	CNN-GRU [LIME]	AUROC	0.79	—	Unverified
9	BiRNN [LIME]	AUROC	0.77	—	Unverified
10	XG-HSI-BERT	Accuracy	0.75	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	MLARAM	Hamming Loss	0.29	—	Unverified
2	MLkNN	Hamming Loss	0.16	—	Unverified
3	Binary Relevance	Hamming Loss	0.14	—	Unverified
4	Neural Classifier Chains	Hamming Loss	0.13	—	Unverified
5	Neural Binary Relevance	Hamming Loss	0.11	—	Unverified
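
The table above reports Hamming Loss, a multi-label metric for which lower values are better, so 0.11 is the strongest result listed. The sketch below shows how the metric is computed: it is the fraction of individual label slots predicted incorrectly. The function name, the three label columns, and the toy data are assumptions made for illustration and are not drawn from any dataset listed here.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of (sample, label) slots where the prediction is wrong."""
    total = 0
    wrong = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            total += 1
            wrong += int(t != p)
    return wrong / total


# Hypothetical multi-label targets: each row is one text and each column
# is one binary label (three illustrative categories).
y_true = [[1, 0, 0], [0, 1, 1], [0, 0, 0], [1, 1, 0]]
y_pred = [[1, 0, 1], [0, 1, 1], [0, 0, 0], [0, 1, 0]]

loss = hamming_loss(y_true, y_pred)
print(f"Hamming loss = {loss:.3f}")
```

Here 2 of the 12 label slots are wrong, giving a loss of about 0.167.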

#	Model	Metric	Claimed	Verified	Status
1	Mozafari et al., 2019	AAA	50.94	—	Unverified
2	SVM	AAA	46.51	—	Unverified
3	Kennedy et al., 2020	AAA	45.5	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HateBERT	Macro F1	0.74	—	Unverified
2	BERT	Macro F1	0.72	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	mBert	Accuracy	0.83	—	Unverified
2	Logistic Regression	Accuracy	0.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HXP + CLAP + CLIP	TEST F1 (macro)	0.85	—	Unverified
2	BERT + ViT + MFCC	TEST F1 (macro)	0.79	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HateBERT	Macro F1	0.49	—	Unverified
2	BERT	Macro F1	0.48	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HateBERT	Macro F1	0.81	—	Unverified
2	BERT	Macro F1	0.8	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Multilingual BERT	F1-score	0.75	—	Unverified
2	AutoML	F1-score	0.74	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	AOM mBERT	F1	0.85	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Baseline	F1	0.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RoBERTa-large-ST	Macro F1	80.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Baseline BERT (task A)	F1	0.77	—	Unverified