Hate Speech Detection

Hate speech detection is the task of determining whether communication, such as text or audio, expresses hatred or encourages violence towards a person or a group of people. Such speech is usually based on prejudice against 'protected characteristics' such as ethnicity, gender, sexual orientation, religion, or age. Example benchmarks include ETHOS and HateXplain. Models can be evaluated with metrics such as the F-score (also called the F-measure).
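As a rough illustration of the F-score mentioned above, the sketch below computes binary F1 and macro-averaged F1 from scratch. The function names and the toy labels (1 = hateful, 0 = not hateful) are assumptions made for this example and are not taken from any benchmark listed on this page.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined),
        # so F1 is reported as 0.0 by convention in this sketch.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes that occur."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_score(y_true, y_pred, positive=c) for c in labels) / len(labels)


# Hypothetical toy data: 1 = hateful, 0 = not hateful.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]

print(round(f1_score(y_true, y_pred), 4))
print(round(macro_f1(y_true, y_pred), 4))
```

Macro averaging weights every class equally, which is one reason it is often preferred when the hateful class is much rarer than the non-hateful class.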

Papers


Showing 351–375 of 507 papers

Title	Date	Tasks	Status
Hate Speech Detection via Dual Contrastive Learning	Jul 10, 2023	Contrastive Learning, Hate Speech Detection	Unverified
HCDIR: End-to-end Hate Context Detection, and Intensity Reduction model for online comments	Dec 20, 2023	Hate Speech Detection, Language Modeling	Unverified
Hierarchical CVAE for Fine-Grained Hate Speech Classification	Aug 31, 2018	Binary Classification, Classification	Unverified
HS-BAN: A Benchmark Dataset of Social Media Comments for Hate Speech Detection in Bangla	Dec 3, 2021	Hate Speech Detection	Unverified
Human and LLM Biases in Hate Speech Annotations: A Socio-Demographic Analysis of Annotators and Targets	Oct 10, 2024	Hate Speech Detection	Unverified
iCompass at Arabic Hate Speech 2022: Detect Hate Speech Using QRNN and Transformers	Jun 1, 2022	Hate Speech Detection	Unverified
Identification of Multiword Expressions in Tweets for Hate Speech Detection	Jun 1, 2022	Hate Speech Detection	Unverified
Identifying False Content and Hate Speech in Sinhala YouTube Videos by Analyzing the Audio	Jan 30, 2024	Hate Speech Detection, Misinformation	Unverified
Identifying Hate Speech Using Neural Networks and Discourse Analysis Techniques	Jun 1, 2022	Hate Speech Detection	Unverified
Identity Construction in a Misogynist Incels Forum	Jun 27, 2023	Hate Speech Detection	Unverified
IITR-CIOL@NLU of Devanagari Script Languages 2025: Multilingual Hate Speech Detection and Target Identification in Devanagari-Scripted Languages	Dec 23, 2024	Binary Classification, Diversity	Unverified
Easy Adaptation to Mitigate Gender Bias in Multilingual Text Classification	Apr 12, 2022	Domain Adaptation, Hate Speech Detection	Code Available
Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter	Nov 1, 2016	Hate Speech Detection	Code Available
Calibrated Learning to Defer with One-vs-All Classifiers	Feb 8, 2022	All, Hate Speech Detection	Code Available
EkoHate: Abusive Language and Hate Speech Detection for Code-switched Political Discussions on Nigerian Twitter	Apr 28, 2024	Abusive Language, Cross-Lingual Transfer	Code Available
Emoji-Based Transfer Learning for Sentiment Tasks	Feb 12, 2021	Hate Speech Detection, Sentiment Analysis	Code Available
Borrowing Human Senses: Comment-Aware Self-Training for Social Media Multimodal Classification	Mar 27, 2023	Classification, Hate Speech Detection	Code Available
Empirical Study of Text Augmentation on Social Media Text in Vietnamese	Sep 25, 2020	Data Augmentation, General Classification	Code Available
Dynamically Refined Regularization for Improving Cross-corpora Hate Speech Detection	Mar 23, 2022	Hate Speech Detection	Code Available
Detecting Online Hate Speech Using Context Aware Models	Oct 20, 2017	Hate Speech Detection	Code Available
TuPy-E: detecting hate speech in Brazilian Portuguese social media with a novel dataset and comprehensive analysis of models	Dec 29, 2023	Hate Speech Detection	Code Available
DeL-haTE: A Deep Learning Tunable Ensemble for Hate Speech Detection	Nov 3, 2020	Hate Speech Detection, Transfer Learning	Code Available
DefVerify: Do Hate Speech Models Reflect Their Dataset's Definition?	Oct 21, 2024	Hate Speech Detection	Code Available
StereoKG: Data-Driven Knowledge Graph Construction for Cultural Knowledge and Stereotypes	May 27, 2022	Fairness, graph construction	Code Available
A Group-Specific Approach to NLP for Hate Speech Detection	Apr 21, 2023	Common Sense Reasoning, Ethics	Code Available


Datasets: Ethos Binary; HateXplain; Ethos MultiLabel; Waseem et al., 2018; AbusEval; Automatic Misogynistic Identification; HateMM; HatEval; OffensEval 2019; ToLD-Br; bajer_danish_misogyny; DKhate

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	BiLSTM + static BE	F1-score	0.8	—	Unverified
2	BERT	F1-score	0.79	—	Unverified
3	BiLSTM+Attention+FT	F1-score	0.77	—	Unverified
4	OPT-175B (few-shot)	F1-score	0.76	—	Unverified
5	CNN+Attention+FT+GV	F1-score	0.74	—	Unverified
6	OPT-175B (one-shot)	F1-score	0.71	—	Unverified
7	OPT-175B (zero-shot)	F1-score	0.67	—	Unverified
8	SVM	F1-score	0.66	—	Unverified
9	Random Forests	F1-score	0.64	—	Unverified
10	Davinci (zero-shot)	F1-score	0.63	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	BERT-MRP	AUROC	0.86	—	Unverified
2	BERT-RP	AUROC	0.85	—	Unverified
3	BERT-HateXplain [Attn]	AUROC	0.85	—	Unverified
4	BERT-HateXplain [LIME]	AUROC	0.85	—	Unverified
5	BERT [Attn]	AUROC	0.84	—	Unverified
6	BiRNN-HateXplain [Attn]	AUROC	0.81	—	Unverified
7	BiRNN-Attn [Attn]	AUROC	0.8	—	Unverified
8	CNN-GRU [LIME]	AUROC	0.79	—	Unverified
9	BiRNN [LIME]	AUROC	0.77	—	Unverified
10	XG-HSI-BERT	Accuracy	0.75	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	MLARAM	Hamming Loss	0.29	—	Unverified
2	MLkNN	Hamming Loss	0.16	—	Unverified
3	Binary Relevance	Hamming Loss	0.14	—	Unverified
4	Neural Classifier Chains	Hamming Loss	0.13	—	Unverified
5	Neural Binary Relevance	Hamming Loss	0.11	—	Unverified
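
For readers unfamiliar with the Hamming Loss metric in the table above, here is a minimal sketch assuming the standard multi-label definition: the fraction of individual label slots predicted incorrectly. The function name and the toy label matrices are hypothetical and are not drawn from the benchmark itself. Under this definition lower values are better, so if the table uses it, the rows are listed from the highest loss to the lowest.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of (sample, label) slots where prediction and truth differ."""
    n_samples = len(y_true)
    n_labels = len(y_true[0])
    wrong = sum(
        t != p
        for row_true, row_pred in zip(y_true, y_pred)
        for t, p in zip(row_true, row_pred)
    )
    return wrong / (n_samples * n_labels)


# Hypothetical toy data: 2 comments, 4 binary labels each.
y_true = [[1, 0, 1, 0],
          [0, 1, 0, 0]]
y_pred = [[1, 0, 0, 0],
          [0, 1, 1, 0]]

print(hamming_loss(y_true, y_pred))  # 2 wrong slots out of 8 -> 0.25
```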

#	Model	Metric	Claimed	Verified	Status
1	Mozafari et al., 2019	AAA	50.94	—	Unverified
2	SVM	AAA	46.51	—	Unverified
3	Kennedy et al., 2020	AAA	45.5	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HateBERT	Macro F1	0.74	—	Unverified
2	BERT	Macro F1	0.72	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	mBert	Accuracy	0.83	—	Unverified
2	Logistic Regression	Accuracy	0.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HXP + CLAP + CLIP	TEST F1 (macro)	0.85	—	Unverified
2	BERT + ViT + MFCC	TEST F1 (macro)	0.79	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HateBERT	Macro F1	0.49	—	Unverified
2	BERT	Macro F1	0.48	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HateBERT	Macro F1	0.81	—	Unverified
2	BERT	Macro F1	0.8	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Multilingual BERT	F1-score	0.75	—	Unverified
2	AutoML	F1-score	0.74	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	AOM mBERT	F1	0.85	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Baseline	F1	0.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RoBERTa-large-ST	Macro F1	80.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Baseline BERT (task A)	F1	0.77	—	Unverified