Hate Speech Detection

Hate speech detection is the task of determining whether a piece of communication, such as text or audio, expresses hatred or encourages violence towards a person or a group of people. Such content is usually rooted in prejudice against 'protected characteristics' such as ethnicity, gender, sexual orientation, religion, or age. Example benchmarks include ETHOS and HateXplain. Models are typically evaluated with metrics such as the F-score (also called the F-measure).
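As an illustration of the F-score mentioned above, here is a minimal sketch in plain Python. It assumes binary labels where 1 marks a hateful example and 0 a non-hateful one. The function name `f_beta` and the toy label lists are hypothetical and are not taken from any benchmark listed on this page.

```python
def f_beta(y_true, y_pred, beta=1.0):
    """Return (precision, recall, F-beta) for binary labels (1 = hateful)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    b2 = beta * beta
    score = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, score


# Toy data (hypothetical): 4 hateful and 4 non-hateful examples.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall, f1 = f_beta(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

With `beta=1.0` the score is the harmonic mean of precision and recall (the F1-score). The macro F1 used by some leaderboards averages this per-class score over all classes, which this sketch does not do.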

Papers

Showing 201–250 of 507 papers

Title	Date	Tasks	Status
Noisy Self-Training with Data Augmentations for Offensive and Hate Speech Detection Tasks	Jul 31, 2023	Data Augmentation, Hate Speech Detection	Code Available
ARC-NLP at Multimodal Hate Speech Event Detection 2023: Multimodal Methods Boosted by Ensemble Learning, Syntactical and Entity Features	Jul 25, 2023	ARC, Deep Learning	Unverified
Wisdom of Instruction-Tuned Language Model Crowds. Exploring Model Label Variation	Jul 24, 2023	Few-Shot Learning, Hate Speech Detection	Unverified
Rule By Example: Harnessing Logical Rules for Explainable Hate Speech Detection	Jul 24, 2023	Contrastive Learning, Deep Learning	Code Available
HateModerate: Testing Hate Speech Detectors against Content Moderation Policies	Jul 23, 2023	Hate Speech Detection	Code Available
Mitigating Label Bias via Decoupled Confident Learning	Jul 18, 2023	Fairness, Hate Speech Detection	Unverified
Hate Speech Detection via Dual Contrastive Learning	Jul 10, 2023	Contrastive Learning, Hate Speech Detection	Unverified
Robust Hate Speech Detection in Social Media: A Cross-Dataset Empirical Evaluation	Jul 4, 2023	Hate Speech Detection	Unverified
Identity Construction in a Misogynist Incels Forum	Jun 27, 2023	Hate Speech Detection	Unverified
Uncovering Political Hate Speech During Indian Election Campaign: A New Low-Resource Dataset and Baselines	Jun 26, 2023	Hate Speech Detection	Code Available
The Art of Embedding Fusion: Optimizing Hate Speech Detection	Jun 26, 2023	Hate Speech Detection	Code Available
My Boli: Code-mixed Marathi-English Corpora, Pretrained Language Models and Evaluation Benchmarks	Jun 24, 2023	Benchmarking, Hate Speech Detection	Unverified
PEACE: Cross-Platform Hate Speech Detection- A Causality-guided Framework	Jun 15, 2023	Hate Speech Detection	Code Available
Multi-modal Hate Speech Detection using Machine Learning	Jun 15, 2023	Hate Speech Detection	Unverified
Leveraging Language Identification to Enhance Code-Mixed Text Classification	Jun 8, 2023	Classification, Hate Speech Detection	Unverified
Evaluating the Effectiveness of Natural Language Inference for Hate Speech Detection in Languages with Limited Labeled Data	Jun 6, 2023	Hate Speech Detection, Natural Language Inference	Code Available
Towards hate speech detection in low-resource languages: Comparing ASR to acoustic word embeddings on Wolof and Swahili	Jun 1, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Revisiting Hate Speech Benchmarks: From Data Curation to System Deployment	Jun 1, 2023	Benchmarking, Hate Speech Detection	Code Available
Explaining Hate Speech Classification with Model Agnostic Methods	May 30, 2023	Classification, Hate Speech Detection	Unverified
Exploiting Explainability to Design Adversarial Attacks and Evaluate Attack Resilience in Hate-Speech Detection Models	May 29, 2023	Adversarial Robustness, Decision Making	Unverified
Towards Legally Enforceable Hate Speech Detection for Public Forums	May 23, 2023	Hate Speech Detection	Code Available
Evaluating ChatGPT's Performance for Multilingual and Emoji-based Hate Speech Detection	May 22, 2023	Hate Speech Detection	Unverified
When the Majority is Wrong: Modeling Annotator Disagreement for Subjective Tasks	May 11, 2023	Hate Speech Detection	Unverified
Antisemitic Messages? A Guide to High-Quality Annotation and a Labeled Dataset of Tweets	Apr 28, 2023	Hate Speech Detection	Unverified
A Group-Specific Approach to NLP for Hate Speech Detection	Apr 21, 2023	Common Sense Reasoning, Ethics	Code Available
Sociocultural knowledge is needed for selection of shots in hate speech detection tasks	Apr 4, 2023	Few-Shot Learning, Hate Speech Detection	Unverified
LAHM : Large Annotated Dataset for Multi-Domain and Multilingual Hate Speech Identification	Apr 3, 2023	Classification, Hate Speech Detection	Unverified
Hate Speech Targets Detection in Parler using BERT	Apr 3, 2023	Hate Speech Detection	Code Available
The Other Side of Compression: Measuring Bias in Pruned Transformers	Apr 1, 2023	Hate Speech Detection, Network Pruning	Code Available
Borrowing Human Senses: Comment-Aware Self-Training for Social Media Multimodal Classification	Mar 27, 2023	Classification, Hate Speech Detection	Code Available
Beyond Toxic: Toxicity Detection Datasets are Not Enough for Brand Safety	Mar 27, 2023	Binary Classification, Classification	Unverified
NoisyHate: Mining Online Human-Written Perturbations for Realistic Robustness Benchmarking of Content Moderation Models	Mar 18, 2023	Adversarial Attack, Benchmarking	Unverified
Transformers and Ensemble methods: A solution for Hate Speech Detection in Arabic languages	Mar 17, 2023	Hate Speech Detection	Code Available
Model-Agnostic Meta-Learning for Multilingual Hate Speech Detection	Mar 4, 2023	Cross-Lingual Transfer, Domain Generalization	Unverified
Hate Speech and Offensive Language Detection using an Emotion-aware Shared Encoder	Feb 17, 2023	Hate Speech Detection	Unverified
Qualitative Analysis of a Graph Transformer Approach to Addressing Hate Speech: Adapting to Dynamically Changing Content	Jan 25, 2023	Hate Speech Detection	Unverified
Leveraging World Knowledge in Implicit Hate Speech Detection	Dec 28, 2022	Entity Linking, Hate Speech Detection	Unverified
AnnoBERT: Effectively Representing Multiple Annotators' Label Choices to Improve Hate Speech Detection	Dec 20, 2022	Hate Speech Detection	Unverified
Multimodal and Explainable Internet Meme Classification	Dec 11, 2022	Classification, Explainable Models	Unverified
A Graph-Based Context-Aware Model to Understand Online Conversations	Nov 16, 2022	Hate Speech Detection, Misinformation	Unverified
Hope Speech Detection on Social Media Platforms	Nov 14, 2022	Hate Speech Detection, Hope Speech Detection	Code Available
How Much Hate with #china? A Preliminary Analysis on China-related Hateful Tweets Two Years After the Covid Pandemic Began	Nov 11, 2022	Hate Speech Detection	Code Available
Multilingual Auxiliary Tasks Training: Bridging the Gap between Languages for Zero-Shot Transfer of Hate Speech Detection Models	Oct 24, 2022	Cross-Lingual Transfer, Hate Speech Detection	Code Available
A Benchmark Study of Contrastive Learning for Arabic Social Meaning	Oct 22, 2022	Contrastive Learning, Dialect Identification	Code Available
Data-Efficient Strategies for Expanding Hate Speech Detection into Under-Resourced Languages	Oct 20, 2022	Hate Speech Detection	Code Available
Transferring Knowledge via Neighborhood-Aware Optimal Transport for Low-Resource Hate Speech Detection	Oct 17, 2022	Hate Speech Detection	Unverified
T5 for Hate Speech, Augmented Data and Ensemble	Oct 11, 2022	Data Augmentation, Explainable artificial intelligence	Code Available
Spread Love Not Hate: Undermining the Importance of Hateful Pre-training for Hate Speech Detection	Oct 9, 2022	Hate Speech Detection	Unverified
Hate Speech and Offensive Language Detection in Bengali	Oct 7, 2022	Hate Speech Detection	Code Available
Hypothesis Engineering for Zero-Shot Hate Speech Detection	Oct 3, 2022	Hate Speech Detection, Natural Language Inference	Code Available

Datasets: Ethos Binary; HateXplain; Ethos MultiLabel; Waseem et al., 2018; AbusEval; Automatic Misogynistic Identification; HateMM; HatEval; OffensEval 2019; ToLD-Br; bajer_danish_misogyny; DKhate

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	BiLSTM + static BE	F1-score	0.8	—	Unverified
2	BERT	F1-score	0.79	—	Unverified
3	BiLSTM+Attention+FT	F1-score	0.77	—	Unverified
4	OPT-175B (few-shot)	F1-score	0.76	—	Unverified
5	CNN+Attention+FT+GV	F1-score	0.74	—	Unverified
6	OPT-175B (one-shot)	F1-score	0.71	—	Unverified
7	OPT-175B (zero-shot)	F1-score	0.67	—	Unverified
8	SVM	F1-score	0.66	—	Unverified
9	Random Forests	F1-score	0.64	—	Unverified
10	Davinci (zero-shot)	F1-score	0.63	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	BERT-MRP	AUROC	0.86	—	Unverified
2	BERT-RP	AUROC	0.85	—	Unverified
3	BERT-HateXplain [LIME]	AUROC	0.85	—	Unverified
4	BERT-HateXplain [Attn]	AUROC	0.85	—	Unverified
5	BERT [Attn]	AUROC	0.84	—	Unverified
6	BiRNN-HateXplain [Attn]	AUROC	0.81	—	Unverified
7	BiRNN-Attn [Attn]	AUROC	0.8	—	Unverified
8	CNN-GRU [LIME]	AUROC	0.79	—	Unverified
9	BiRNN [LIME]	AUROC	0.77	—	Unverified
10	XG-HSI-BERT	Accuracy	0.75	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	MLARAM	Hamming Loss	0.29	—	Unverified
2	MLkNN	Hamming Loss	0.16	—	Unverified
3	Binary Relevance	Hamming Loss	0.14	—	Unverified
4	Neural Classifier Chains	Hamming Loss	0.13	—	Unverified
5	Neural Binary Relevance	Hamming Loss	0.11	—	Unverified
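
The Hamming loss reported in the table above is the fraction of individual label assignments that a multi-label classifier gets wrong, so lower values indicate fewer errors. A minimal sketch follows, assuming each example carries a fixed-length list of 0/1 labels. The function name and the toy label rows are hypothetical and do not come from any listed dataset.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of label slots where the prediction differs from the truth."""
    total = 0
    wrong = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            total += 1
            if t != p:
                wrong += 1
    return wrong / total


# Toy data (hypothetical): two examples, three labels each.
y_true = [[1, 0, 0], [0, 1, 1]]
y_pred = [[1, 0, 1], [0, 1, 1]]

loss = hamming_loss(y_true, y_pred)
print(f"hamming loss = {loss:.3f}")
```

Here one of the six label slots is mispredicted, so the loss is 1/6.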

#	Model	Metric	Claimed	Verified	Status
1	Mozafari et al., 2019	AAA	50.94	—	Unverified
2	SVM	AAA	46.51	—	Unverified
3	Kennedy et al., 2020	AAA	45.5	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HateBERT	Macro F1	0.74	—	Unverified
2	BERT	Macro F1	0.72	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	mBert	Accuracy	0.83	—	Unverified
2	Logistic Regression	Accuracy	0.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HXP + CLAP + CLIP	TEST F1 (macro)	0.85	—	Unverified
2	BERT + ViT + MFCC	TEST F1 (macro)	0.79	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HateBERT	Macro F1	0.49	—	Unverified
2	BERT	Macro F1	0.48	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HateBERT	Macro F1	0.81	—	Unverified
2	BERT	Macro F1	0.8	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Multilingual BERT	F1-score	0.75	—	Unverified
2	AutoML	F1-score	0.74	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	AOM mBERT	F1	0.85	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Baseline	F1	0.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RoBERTa-large-ST	Macro F1	80.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Baseline BERT (task A)	F1	0.77	—	Unverified