Hate Speech Detection

Hate speech detection is the task of determining whether a communication, such as text or audio, expresses hatred or encourages violence towards a person or a group of people. Such hatred is usually based on prejudice against 'protected characteristics' such as ethnicity, gender, sexual orientation, religion, or age. Example benchmarks include ETHOS and HateXplain. Models are typically evaluated with metrics such as the F-score (also called the F-measure).
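As a rough illustration of the F-score mentioned above, the sketch below computes F1 (the harmonic mean of precision and recall) for a binary hate / not-hate labelling, plus the macro-averaged variant. The function names and the toy labels are assumptions made for this example and are not taken from any benchmark listed on this page.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_score(y_true, y_pred, c) for c in labels) / len(labels)


# Hypothetical toy data: 1 = hate speech, 0 = not hate speech.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(f1_score(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

Macro averaging gives each class equal weight regardless of how many examples it has, which matters when the hateful class is much rarer than the non-hateful one.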

Papers

Showing 351–400 of 507 papers

Title	Date	Tasks	Status
Multimodal and Explainable Internet Meme Classification	Dec 11, 2022	Classification, Explainable Models	Unverified
When the Majority is Wrong: Modeling Annotator Disagreement for Subjective Tasks	May 11, 2023	Hate Speech Detection	Unverified
Multi-modal Hate Speech Detection using Machine Learning	Jun 15, 2023	Hate Speech Detection	Unverified
Multitask Learning for Arabic Offensive Language and Hate-Speech Detection	May 1, 2020	Hate Speech Detection, Transfer Learning	Unverified
A Dataset of Hindi-English Code-Mixed Social Media Text for Hate Speech Detection	Jun 1, 2018	General Classification, Hate Speech Detection	Unverified
My Boli: Code-mixed Marathi-English Corpora, Pretrained Language Models and Evaluation Benchmarks	Jun 24, 2023	Benchmarking, Hate Speech Detection	Unverified
A Comprehensive Study on NLP Data Augmentation for Hate Speech Detection: Legacy Methods, BERT, and LLMs	Mar 30, 2024	Data Augmentation, Hate Speech Detection	Unverified
Navigating Dialectal Bias and Ethical Complexities in Levantine Arabic Hate Speech Detection	Dec 14, 2024	Hate Speech Detection	Unverified
ToxSyn-PT: A Large-Scale Synthetic Dataset for Hate Speech Detection in Portuguese	Jun 11, 2025	Hate Speech Detection, Multi-Label Classification	Unverified
Transferring Knowledge via Neighborhood-Aware Optimal Transport for Low-Resource Hate Speech Detection	Oct 17, 2022	Hate Speech Detection	Unverified
Aggression Detection in Social Media: Using Deep Neural Networks, Data Augmentation, and Pseudo Labeling	Aug 1, 2018	Data Augmentation, Feature Engineering	Unverified
Whose Emotions and Moral Sentiments Do Language Models Reflect?	Feb 16, 2024	Hate Speech Detection	Unverified
NoisyHate: Mining Online Human-Written Perturbations for Realistic Robustness Benchmarking of Content Moderation Models	Mar 18, 2023	Adversarial Attack, Benchmarking	Unverified
Trustworthy Hate Speech Detection Through Visual Augmentation	Sep 20, 2024	Hate Speech Detection	Unverified
TuEval at SemEval-2019 Task 5: LSTM Approach to Hate Speech Detection in English and Spanish	Jun 1, 2019	Hate Speech Detection	Unverified
Offensive Language and Hate Speech Detection for Danish	Aug 13, 2019	Hate Speech Detection	Unverified
Offensive Language and Hate Speech Detection with Deep Learning and Transfer Learning	Aug 6, 2021	Data Augmentation, Hate Speech Detection	Unverified
On a Benefit of Mask Language Modeling: Robustness to Simplicity Bias	Oct 11, 2021	Hate Speech Detection, Language Modeling	Unverified
One to rule them all: Towards Joint Indic Language Hate Speech Detection	Sep 28, 2021	All, Hate Speech Detection	Unverified
On Fairness of Task Arithmetic: The Role of Task Vectors	May 30, 2025	Fairness, Hate Speech Detection	Unverified
On Importance of Code-Mixed Embeddings for Hate Speech Identification	Nov 27, 2024	Hate Speech Detection, Sentence	Unverified
On Limitations of LLM as Annotator for Low Resource Languages	Nov 26, 2024	Hate Speech Detection, News Classification	Unverified
Online Hate: Behavioural Dynamics and Relationship with Misinformation	May 28, 2021	Hate Speech Detection, Misinformation	Unverified
On the Challenges of Building Datasets for Hate Speech Detection	Sep 6, 2023	Hate Speech Detection	Unverified
A Legal Approach to Hate Speech -- Operationalizing the EU's Legal Framework against the Expression of Hatred as an NLP Task	Apr 7, 2020	Decision Making, Hate Speech Detection	Unverified
A Federated Approach to Few-Shot Hate Speech Detection for Marginalized Communities	Dec 6, 2024	Federated Learning, Few-Shot Learning	Unverified
OSACT4 Shared Task on Offensive Language Detection: Intensive Preprocessing-Based Approach	May 14, 2020	Classification, Dimensionality Reduction	Unverified
OSACT4 Shared Tasks: Ensembled Stacked Classification for Offensive and Hate Speech in Arabic Tweets	May 1, 2020	General Classification, Hate Speech Detection	Unverified
BanglaEmbed: Efficient Sentence Embedding Models for a Low-Resource Language Using Cross-Lingual Distillation Techniques	Nov 22, 2024	Hate Speech Detection, Knowledge Distillation	Unverified
BanTH: A Multi-label Hate Speech Detection Dataset for Transliterated Bangla	Oct 17, 2024	Classification, Hate Speech Detection	Unverified
Overview of OSACT5 Shared Task on Arabic Offensive Language and Hate Speech Detection	Jun 1, 2022	Hate Speech Detection	Unverified
Bayesian Methods for Semi-supervised Text Annotation	Oct 28, 2020	Deep Learning, Hate Speech Detection	Unverified
Author Profiling for Hate Speech Detection	Feb 14, 2019	16k, Author Profiling	Unverified
Overview of the HASOC track at FIRE 2020: Hate Speech and Offensive Content Identification in Indo-European Languages	Aug 12, 2021	Binary Classification, Classification	Unverified
BERT-based Ensemble Approaches for Hate Speech Detection	Sep 14, 2022	Hate Speech Detection, Multi-Label Classification	Unverified
BERT or FastText? A Comparative Analysis of Contextual as well as Non-Contextual Embeddings	Nov 26, 2024	Hate Speech Detection, News Classification	Unverified
Beyond Explanation: A Case for Exploratory Text Visualizations of Non-Aggregated, Annotated Datasets	Jun 1, 2022	Bias Detection, Hate Speech Detection	Unverified
Beyond Toxic: Toxicity Detection Datasets are Not Enough for Brand Safety	Mar 27, 2023	Binary Classification, Classification	Unverified
BOISHOMMO: Holistic Approach for Bangla Hate Speech	Apr 11, 2025	Hate Speech Detection	Unverified
Measuring Catastrophic Forgetting in Cross-Lingual Transfer Paradigms: Exploring Tuning Strategies	Sep 12, 2023	Cross-Lingual Transfer, Hate Speech Detection	Unverified
Break it, Imitate it, Fix it: Robustness by Generating Human-Like Attacks	Oct 25, 2023	Hate Speech Detection	Unverified
Bridging Modalities: Enhancing Cross-Modality Hate Speech Detection with Few-Shot In-Context Learning	Oct 8, 2024	Few-Shot Learning, Hate Speech Detection	Unverified
Bridging the gap in online hate speech detection: a comparative analysis of BERT and traditional models for homophobic content identification on X/Twitter	May 15, 2024	Hate Speech Detection, Sentiment Analysis	Unverified
A Unified Multi-Task Learning Architecture for Hate Detection Leveraging User-Based Information	Nov 11, 2024	Abusive Language, Hate Speech Detection	Unverified
ABARUAH at SemEval-2019 Task 5: Bi-directional LSTM for Hate Speech Detection	Jun 1, 2019	Hate Speech Detection	Unverified
Can Prompting LLMs Unlock Hate Speech Detection across Languages? A Zero-shot and Few-shot Study	May 9, 2025	Diversity, Hate Speech Detection	Unverified
Afaan Oromo Hate Speech Detection and Classification on Social Media	Jun 1, 2022	Classification, Hate Speech Detection	Unverified
Ceasing hate with MoH: Hate Speech Detection in Hindi-English Code-Switched Language	Oct 18, 2021	Hate Speech Detection, Language Identification	Unverified
Chain-of-Translation Prompting (CoTR): A Novel Prompting Technique for Low Resource Languages	Sep 6, 2024	Hate Speech Detection, Sentiment Analysis	Unverified
A Comparative Study of Different State-of-the-Art Hate Speech Detection Methods in Hindi-English Code-Mixed Data	May 1, 2020	Hate Speech Detection	Unverified

Datasets: Ethos Binary; HateXplain; Ethos MultiLabel; Waseem et al., 2018; AbusEval; Automatic Misogynistic Identification; HateMM; HatEval; OffensEval 2019; ToLD-Br; bajer_danish_misogyny; DKhate

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	BiLSTM + static BE	F1-score	0.8	—	Unverified
2	BERT	F1-score	0.79	—	Unverified
3	BiLSTM+Attention+FT	F1-score	0.77	—	Unverified
4	OPT-175B (few-shot)	F1-score	0.76	—	Unverified
5	CNN+Attention+FT+GV	F1-score	0.74	—	Unverified
6	OPT-175B (one-shot)	F1-score	0.71	—	Unverified
7	OPT-175B (zero-shot)	F1-score	0.67	—	Unverified
8	SVM	F1-score	0.66	—	Unverified
9	Random Forests	F1-score	0.64	—	Unverified
10	Davinci (zero-shot)	F1-score	0.63	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	BERT-MRP	AUROC	0.86	—	Unverified
2	BERT-RP	AUROC	0.85	—	Unverified
3	BERT-HateXplain [LIME]	AUROC	0.85	—	Unverified
4	BERT-HateXplain [Attn]	AUROC	0.85	—	Unverified
5	BERT [Attn]	AUROC	0.84	—	Unverified
6	BiRNN-HateXplain [Attn]	AUROC	0.81	—	Unverified
7	BiRNN-Attn [Attn]	AUROC	0.8	—	Unverified
8	CNN-GRU [LIME]	AUROC	0.79	—	Unverified
9	BiRNN [LIME]	AUROC	0.77	—	Unverified
10	XG-HSI-BERT	Accuracy	0.75	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	MLARAM	Hamming Loss	0.29	—	Unverified
2	MLkNN	Hamming Loss	0.16	—	Unverified
3	Binary Relevance	Hamming Loss	0.14	—	Unverified
4	Neural Classifier Chains	Hamming Loss	0.13	—	Unverified
5	Neural Binary Relevance	Hamming Loss	0.11	—	Unverified
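
For readers unfamiliar with the Hamming Loss metric in the table above: it is the fraction of individual label decisions that are wrong in a multi-label prediction, so lower values are better and 0 means every label matched. The sketch below is a minimal illustration under that standard definition. The function name, the label columns, and the toy data are hypothetical and are not drawn from the benchmark itself.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of (sample, label) slots where prediction and truth differ."""
    total = 0
    wrong = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            total += 1
            wrong += int(t != p)
    return wrong / total


# Hypothetical multi-label targets: each row is one comment, each column
# one illustrative category (for example violence, gender, race, religion).
y_true = [
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
]
y_pred = [
    [1, 0, 0, 0],  # misses the fourth label
    [0, 1, 0, 0],  # fully correct
    [0, 1, 1, 0],  # adds a spurious second label
]

print(hamming_loss(y_true, y_pred))  # 2 wrong slots out of 12
print(hamming_loss(y_true, y_true))  # a perfect prediction gives 0.0
```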

#	Model	Metric	Claimed	Verified	Status
1	Mozafari et al., 2019	AAA	50.94	—	Unverified
2	SVM	AAA	46.51	—	Unverified
3	Kennedy et al., 2020	AAA	45.5	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HateBERT	Macro F1	0.74	—	Unverified
2	BERT	Macro F1	0.72	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	mBert	Accuracy	0.83	—	Unverified
2	Logistic Regression	Accuracy	0.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HXP + CLAP + CLIP	TEST F1 (macro)	0.85	—	Unverified
2	BERT + ViT + MFCC	TEST F1 (macro)	0.79	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HateBERT	Macro F1	0.49	—	Unverified
2	BERT	Macro F1	0.48	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HateBERT	Macro F1	0.81	—	Unverified
2	BERT	Macro F1	0.8	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Multilingual BERT	F1-score	0.75	—	Unverified
2	AutoML	F1-score	0.74	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	AOM mBERT	F1	0.85	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Baseline	F1	0.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RoBERTa-large-ST	Macro F1	80.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Baseline BERT (task A)	F1	0.77	—	Unverified