Hate Speech Detection

Hate speech detection is the task of determining whether a communication (text, audio, or another medium) expresses hatred or encourages violence towards a person or a group of people. Such speech is usually rooted in prejudice against 'protected characteristics' such as ethnicity, gender, sexual orientation, religion, or age. Example benchmarks include ETHOS and HateXplain. Models are commonly evaluated with metrics such as the F-score (also called the F-measure).
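As a rough illustration of the F-score mentioned above, the following sketch computes binary F1 and macro-averaged F1 from lists of gold and predicted labels. The function names and the toy labels are assumptions made for this example; they are not taken from any of the listed papers or benchmarks.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label that occurs."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_score(y_true, y_pred, positive=l) for l in labels) / len(labels)


# Hypothetical toy data: 1 = hateful, 0 = not hateful.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]

print(round(f1_score(y_true, y_pred), 3))
print(round(macro_f1(y_true, y_pred), 3))
```

Macro averaging gives each class equal weight, which matters for hate speech datasets where the hateful class is often a small minority.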

Papers

Showing 351–400 of 507 papers

Title	Date	Tasks	Status
Hate Speech Detection via Dual Contrastive Learning	Jul 10, 2023	Contrastive Learning, Hate Speech Detection	Unverified
HCDIR: End-to-end Hate Context Detection, and Intensity Reduction model for online comments	Dec 20, 2023	Hate Speech Detection, Language Modeling	Unverified
Hierarchical CVAE for Fine-Grained Hate Speech Classification	Aug 31, 2018	Binary Classification, Classification	Unverified
HS-BAN: A Benchmark Dataset of Social Media Comments for Hate Speech Detection in Bangla	Dec 3, 2021	Hate Speech Detection	Unverified
Human and LLM Biases in Hate Speech Annotations: A Socio-Demographic Analysis of Annotators and Targets	Oct 10, 2024	Hate Speech Detection	Unverified
iCompass at Arabic Hate Speech 2022: Detect Hate Speech Using QRNN and Transformers	Jun 1, 2022	Hate Speech Detection	Unverified
Identification of Multiword Expressions in Tweets for Hate Speech Detection	Jun 1, 2022	Hate Speech Detection	Unverified
Identifying False Content and Hate Speech in Sinhala YouTube Videos by Analyzing the Audio	Jan 30, 2024	Hate Speech Detection, Misinformation	Unverified
Identifying Hate Speech Using Neural Networks and Discourse Analysis Techniques	Jun 1, 2022	Hate Speech Detection	Unverified
Identity Construction in a Misogynist Incels Forum	Jun 27, 2023	Hate Speech Detection	Unverified
IITR-CIOL@NLU of Devanagari Script Languages 2025: Multilingual Hate Speech Detection and Target Identification in Devanagari-Scripted Languages	Dec 23, 2024	Binary Classification, Diversity	Unverified
Easy Adaptation to Mitigate Gender Bias in Multilingual Text Classification	Apr 12, 2022	Domain Adaptation, Hate Speech Detection	Code Available
Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter	Nov 1, 2016	Hate Speech Detection	Code Available
Calibrated Learning to Defer with One-vs-All Classifiers	Feb 8, 2022	All, Hate Speech Detection	Code Available
EkoHate: Abusive Language and Hate Speech Detection for Code-switched Political Discussions on Nigerian Twitter	Apr 28, 2024	Abusive Language, Cross-Lingual Transfer	Code Available
Emoji-Based Transfer Learning for Sentiment Tasks	Feb 12, 2021	Hate Speech Detection, Sentiment Analysis	Code Available
Borrowing Human Senses: Comment-Aware Self-Training for Social Media Multimodal Classification	Mar 27, 2023	Classification, Hate Speech Detection	Code Available
Empirical Study of Text Augmentation on Social Media Text in Vietnamese	Sep 25, 2020	Data Augmentation, General Classification	Code Available
Dynamically Refined Regularization for Improving Cross-corpora Hate Speech Detection	Mar 23, 2022	Hate Speech Detection	Code Available
Detecting Online Hate Speech Using Context Aware Models	Oct 20, 2017	Hate Speech Detection	Code Available
TuPy-E: detecting hate speech in Brazilian Portuguese social media with a novel dataset and comprehensive analysis of models	Dec 29, 2023	Hate Speech Detection	Code Available
DeL-haTE: A Deep Learning Tunable Ensemble for Hate Speech Detection	Nov 3, 2020	Hate Speech Detection, Transfer Learning	Code Available
DefVerify: Do Hate Speech Models Reflect Their Dataset's Definition?	Oct 21, 2024	Hate Speech Detection	Code Available
StereoKG: Data-Driven Knowledge Graph Construction for Cultural Knowledge and Stereotypes	May 27, 2022	Fairness, graph construction	Code Available
A Group-Specific Approach to NLP for Hate Speech Detection	Apr 21, 2023	Common Sense Reasoning, Ethics	Code Available
Label-aware Hard Negative Sampling Strategies with Momentum Contrastive Learning for Implicit Hate Speech Detection	Jun 12, 2024	Contrastive Learning, Hate Speech Detection	Code Available
Evaluating the Effectiveness of Natural Language Inference for Hate Speech Detection in Languages with Limited Labeled Data	Jun 6, 2023	Hate Speech Detection, Natural Language Inference	Code Available
Evaluation of Hate Speech Detection Using Large Language Models and Geographical Contextualization	Feb 26, 2025	Adversarial Robustness, Binary Classification	Code Available
Examining a hate speech corpus for hate speech detection and popularity prediction	May 12, 2018	Hate Speech Detection	Code Available
Towards a Robust Framework for Multimodal Hate Detection: A Study on Video vs. Image-based Content	Feb 11, 2025	Hate Speech Detection, Video Classification	Code Available
PEACE: Cross-Platform Hate Speech Detection- A Causality-guided Framework	Jun 15, 2023	Hate Speech Detection	Code Available
Personalisation or Prejudice? Addressing Geographic Bias in Hate Speech Detection using Debias Tuning in Large Language Models	May 4, 2025	Hate Speech Detection	Code Available
Benchmarking Post-Hoc Interpretability Approaches for Transformer-based Misogyny Detection	May 1, 2022	Benchmarking, Hate Speech Detection	Code Available
ViTHSD: Exploiting Hatred by Targets for Hate Speech Detection on Vietnamese Social Media Texts	Apr 30, 2024	Hate Speech Detection, Language Modelling	Code Available
Battling Hateful Content in Indic Languages HASOC '21	Oct 25, 2021	Hate Speech Detection	Code Available
Latent Feature-based Data Splits to Improve Generalisation Evaluation: A Hate Speech Detection Case Study	Nov 16, 2023	Hate Speech Detection	Code Available
Exploring Hate Speech Detection in Multimodal Publications	Oct 9, 2019	Hate Speech Detection	Code Available
Exploring Hate Speech Detection with HateXplain and BERT	Aug 9, 2022	Hate Speech Detection	Code Available
An Online Multilingual Hate speech Recognition System	Nov 23, 2020	Hate Speech Detection, speech-recognition	Code Available
Automatic Textual Normalization for Hate Speech Detection	Nov 12, 2023	Hate Speech Detection, Lexical Normalization	Code Available
Placing M-Phasis on the Plurality of Hate: A Feature-Based Corpus of Hate Online	Apr 28, 2022	Articles, Hate Speech Detection	Code Available
Power of Explanations: Towards automatic debiasing in hate speech detection	Sep 7, 2022	Fairness, Hate Speech Detection	Code Available
Exploring the Trade-off Between Model Performance and Explanation Plausibility of Text Classifiers Using Human Rationales	Apr 3, 2024	Contrastive Learning, Hate Speech Detection	Code Available
Exploring Tokenization Strategies and Vocabulary Sizes for Enhanced Arabic Language Models	Mar 17, 2024	Computational Efficiency, Hate Speech Detection	Code Available
A Benchmark Study of Contrastive Learning for Arabic Social Meaning	Oct 22, 2022	Contrastive Learning, Dialect Identification	Code Available
Two Heads are Better than One: Nested PoE for Robust Defense Against Multi-Backdoors	Apr 2, 2024	Data Poisoning, Hate Speech Detection	Code Available
Deep Learning for Hate Speech Detection in Tweets	Jun 1, 2017	16k, Deep Learning	Code Available
Towards Efficient and Explainable Hate Speech Detection via Model Distillation	Dec 18, 2024	Abusive Language, Hate Speech Detection	Code Available
Probing Critical Learning Dynamics of PLMs for Hate Speech Detection	Feb 3, 2024	Benchmarking, Hate Speech Detection	Code Available
Features or Spurious Artifacts? Data-centric Baselines for Fair and Robust Hate Speech Detection	Jul 1, 2022	Fairness, Hate Speech Detection	Code Available

Datasets: Ethos Binary; HateXplain; Ethos MultiLabel; Waseem et al., 2018; AbusEval; Automatic Misogynistic Identification; HateMM; HatEval; OffensEval 2019; ToLD-Br; bajer_danish_misogyny; DKhate

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	BiLSTM + static BE	F1-score	0.8	—	Unverified
2	BERT	F1-score	0.79	—	Unverified
3	BiLSTM+Attention+FT	F1-score	0.77	—	Unverified
4	OPT-175B (few-shot)	F1-score	0.76	—	Unverified
5	CNN+Attention+FT+GV	F1-score	0.74	—	Unverified
6	OPT-175B (one-shot)	F1-score	0.71	—	Unverified
7	OPT-175B (zero-shot)	F1-score	0.67	—	Unverified
8	SVM	F1-score	0.66	—	Unverified
9	Random Forests	F1-score	0.64	—	Unverified
10	Davinci (zero-shot)	F1-score	0.63	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	BERT-MRP	AUROC	0.86	—	Unverified
2	BERT-RP	AUROC	0.85	—	Unverified
3	BERT-HateXplain [LIME]	AUROC	0.85	—	Unverified
4	BERT-HateXplain [Attn]	AUROC	0.85	—	Unverified
5	BERT [Attn]	AUROC	0.84	—	Unverified
6	BiRNN-HateXplain [Attn]	AUROC	0.81	—	Unverified
7	BiRNN-Attn [Attn]	AUROC	0.8	—	Unverified
8	CNN-GRU [LIME]	AUROC	0.79	—	Unverified
9	BiRNN [LIME]	AUROC	0.77	—	Unverified
10	XG-HSI-BERT	Accuracy	0.75	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	MLARAM	Hamming Loss	0.29	—	Unverified
2	MLkNN	Hamming Loss	0.16	—	Unverified
3	Binary Relevance	Hamming Loss	0.14	—	Unverified
4	Neural Classifier Chains	Hamming Loss	0.13	—	Unverified
5	Neural Binary Relevance	Hamming Loss	0.11	—	Unverified
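The Hamming Loss reported in the table above is an error rate for multi-label prediction, so lower values indicate better predictions. A minimal sketch of the computation follows, assuming each example is a fixed-length list of 0/1 label indicators; the function name and the toy data are hypothetical and do not reproduce any listed result.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots predicted incorrectly."""
    total = 0
    wrong = 0
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("each example needs the same number of labels")
        for t, p in zip(true_row, pred_row):
            total += 1
            wrong += int(t != p)
    if total == 0:
        raise ValueError("no labels to score")
    return wrong / total


# Hypothetical toy data: two comments, four label slots each.
y_true = [[1, 0, 0, 1], [0, 1, 0, 0]]
y_pred = [[1, 0, 1, 1], [0, 0, 0, 0]]

print(hamming_loss(y_true, y_pred))  # 2 wrong slots out of 8 -> 0.25
```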

#	Model	Metric	Claimed	Verified	Status
1	Mozafari et al., 2019	AAA	50.94	—	Unverified
2	SVM	AAA	46.51	—	Unverified
3	Kennedy et al., 2020	AAA	45.5	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HateBERT	Macro F1	0.74	—	Unverified
2	BERT	Macro F1	0.72	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	mBert	Accuracy	0.83	—	Unverified
2	Logistic Regression	Accuracy	0.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HXP + CLAP + CLIP	TEST F1 (macro)	0.85	—	Unverified
2	BERT + ViT + MFCC	TEST F1 (macro)	0.79	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HateBERT	Macro F1	0.49	—	Unverified
2	BERT	Macro F1	0.48	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HateBERT	Macro F1	0.81	—	Unverified
2	BERT	Macro F1	0.8	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Multilingual BERT	F1-score	0.75	—	Unverified
2	AutoML	F1-score	0.74	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	AOM mBERT	F1	0.85	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Baseline	F1	0.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RoBERTa-large-ST	Macro F1	80.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Baseline BERT (task A)	F1	0.77	—	Unverified