SOTAVerified|Agents Browse Leaderboard About

Bias Detection

Bias detection is the task of detecting and measuring racism, sexism and otherwise discriminatory behavior in a model (Source: https://stereoset.mit.edu/)

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 171–180 of 199 papers

Title	Date	Tasks	Status	Hype	Score
Current State-of-the-Art of Bias Detection and Mitigation in Machine Translation for African and European Languages: a Review	Oct 28, 2024	Bias DetectionDiversity	—Unverified	0	0
Current Topological and Machine Learning Applications for Bias Detection in Text	Nov 22, 2023	Bias DetectionLanguage Modeling	—Unverified	0	0
BEADs: Bias Evaluation Across Domains	Jun 6, 2024	BenchmarkingBias Detection	—Unverified	0	0
Reinforcement Learning from Multi-role Debates as Feedback for Bias Mitigation in LLMs	Apr 15, 2024	Bias DetectionLogical Reasoning	—Unverified	0	0
Decoding Biases: Automated Methods and LLM Judges for Gender Bias Detection in Language Models	Aug 7, 2024	Bias DetectionGender Bias Detection	—Unverified	0	0
Decoding News Bias: Multi Bias Detection in News Articles	Jan 5, 2025	ArticlesBias Detection	—Unverified	0	0
Decoding News Narratives: A Critical Analysis of Large Language Models in Framing Detection	Feb 18, 2024	Bias Detection	—Unverified	0	0
Deep Learning for Bias Detection: From Inception to Deployment	Oct 12, 2021	Bias DetectionDeep Learning	—Unverified	0	0
Quantifying Risk Propensities of Large Language Models: Ethical Focus and Bias Detection through Role-Play	Oct 26, 2024	Bias DetectionDecision Making	—Unverified	0	0
Designing Tools for Semi-Automated Detection of Machine Learning Biases: An Interview Study	Mar 13, 2020	Bias DetectionBIG-bench Machine Learning	—Unverified	0	0

Show:10 25 50

← PrevPage 18 of 20Next →

All datasets StereoSet rt-inod-bias ICAT LLM bias PlantVillage_8px Wiki Neutrality Corpus

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-2 (small)	ICAT Score	72.97	—	Unverified
2	XLNet (large)	ICAT Score	72.03	—	Unverified
3	GPT-2 (medium)	ICAT Score	71.73	—	Unverified
4	BERT (base)	ICAT Score	71.21	—	Unverified
5	GPT-2 (large)	ICAT Score	70.54	—	Unverified
6	BERT (large)	ICAT Score	69.89	—	Unverified
7	RoBERTa (base)	ICAT Score	67.5	—	Unverified
8	GAL 120B	ICAT Score	65.6	—	Unverified
9	XLNet (base)	ICAT Score	62.1	—	Unverified
10	GPT-3 (text-davinci-002)	ICAT Score	60.8	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	GPT-4	Best-of	0.5	—	Unverified
2	Baseline	Best-of	0.41	—	Unverified
3	Gemma	Best-of	0.41	—	Unverified
4	Mistral	Best-of	0.36	—	Unverified
5	Llama2	Best-of	0.34	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	BAD	ICAT Score	23.44	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RandomForest_default_hyperparameters	Accuracy (%)	49	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RoBERTa+ALBERT	F1	70.4	—	Unverified