SOTAVerified

Bias Detection

Bias detection is the task of detecting and measuring racism, sexism, and other discriminatory behavior in a model (Source: https://stereoset.mit.edu/).

Papers

Showing 151-199 of 199 papers

Title | Status | Hype
BiasGuard: A Reasoning-enhanced Bias Detection Tool For Large Language Models | - | 0
Bias in Large Language Models: Origin, Evaluation, and Mitigation | - | 0
Bias in word embeddings | - | 0
BiasLab: Toward Explainable Political Bias Detection with Dual-Axis Annotations and Rationale Indicators | - | 0
BiasScanner: Automatic Detection and Classification of News Bias to Strengthen Democracy | - | 0
BiaSWE: An Expert Annotated Dataset for Misogyny Detection in Swedish | - | 0
Personalized Detection of Cognitive Biases in Actions of Users from Their Logs: Anchoring and Recency Biases | - | 0
Towards Equitable AI: Detecting Bias in Using Large Language Models for Marketing | - | 0
Can we Debias Social Stereotypes in AI-Generated Images? Examining Text-to-Image Outputs and User Perceptions | - | 0
Can We Trust AI Agents? A Case Study of an LLM-Based Multi-Agent System for Ethical AI | - | 0
Cascading Adversarial Bias from Injection to Distillation in Language Models | - | 0
Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021): Workshop and Shared Task Report | - | 0
ChatGPT v.s. Media Bias: A Comparative Study of GPT-3.5 and Fine-tuned Language Models | - | 0
Classifier-to-Bias: Toward Unsupervised Automatic Bias Detection for Visual Classifiers | - | 0
Cognitive Bias Detection Using Advanced Prompt Engineering | - | 0
Constructive Interpretability with CoLabel: Corroborative Integration, Complementary Features, and Collaborative Learning | - | 0
Pseudo-labelling Enhanced Media Bias Detection | - | 0
Towards Fairer Health Recommendations: finding informative unbiased samples via Word Sense Disambiguation | - | 0
BENN: Bias Estimation Using Deep Neural Network | - | 0
Detecting Cross-Geographic Biases in Toxicity Modeling on Social Media | - | 0
Current State-of-the-Art of Bias Detection and Mitigation in Machine Translation for African and European Languages: a Review | - | 0
Current Topological and Machine Learning Applications for Bias Detection in Text | - | 0
BEADs: Bias Evaluation Across Domains | - | 0
Reinforcement Learning from Multi-role Debates as Feedback for Bias Mitigation in LLMs | - | 0
Decoding Biases: Automated Methods and LLM Judges for Gender Bias Detection in Language Models | - | 0
Decoding News Bias: Multi Bias Detection in News Articles | - | 0
Decoding News Narratives: A Critical Analysis of Large Language Models in Framing Detection | - | 0
Deep Learning for Bias Detection: From Inception to Deployment | - | 0
Quantifying Risk Propensities of Large Language Models: Ethical Focus and Bias Detection through Role-Play | - | 0
Designing Tools for Semi-Automated Detection of Machine Learning Biases: An Interview Study | - | 0
Back to Square One: Artifact Detection, Training and Commonsense Disentanglement in the Winograd Schema | - | 0
Detecting Gender Bias in Transformer-based Models: A Case Study on BERT | - | 0
Detecting Linguistic Bias in Government Documents Using Large language Models | - | 0
The Impact of Unstated Norms in Bias Analysis of Language Models | - | 0
Detecting Political Bias in News Articles Using Headline Attention | - | 0
Robots Enact Malignant Stereotypes | - | 0
Towards Identifying Social Bias in Dialog Systems: Frame, Datasets, and Benchmarks | - | 0
Unsupervised Bias Detection in College Student Newspapers | - | 0
Sample Complexity of Bias Detection with Subsampled Point-to-Subspace Distances | - | 0
DocNet: Semantic Structure in Inductive Bias Detection Models | - | 0
Towards Understanding and Quantifying Uncertainty for Text-to-Image Generation | - | 0
Seeing Like an AI: How LLMs Apply (and Misapply) Wikipedia Neutrality Norms | - | 0
Efficient Fairness Testing in Large Language Models: Prioritizing Metamorphic Relations for Bias Detection | - | 0
Efficient Gender Debiasing of Pre-trained Indic Language Models | - | 0
Enhancing Bias Detection in Political News Using Pragmatic Presupposition | - | 0
Mitigating the Risk of Health Inequity Exacerbated by Large Language Models | - | 0
Auditing Predictive Models for Intersectional Biases | - | 0
Epistemological Bias As a Means for the Automated Detection of Injustices in Text | - | 0
Evaluating AI fairness in credit scoring with the BRIO tool | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-2 (small) | ICAT Score | 72.97 | - | Unverified
2 | XLNet (large) | ICAT Score | 72.03 | - | Unverified
3 | GPT-2 (medium) | ICAT Score | 71.73 | - | Unverified
4 | BERT (base) | ICAT Score | 71.21 | - | Unverified
5 | GPT-2 (large) | ICAT Score | 70.54 | - | Unverified
6 | BERT (large) | ICAT Score | 69.89 | - | Unverified
7 | RoBERTa (base) | ICAT Score | 67.5 | - | Unverified
8 | GAL 120B | ICAT Score | 65.6 | - | Unverified
9 | XLNet (base) | ICAT Score | 62.1 | - | Unverified
10 | GPT-3 (text-davinci-002) | ICAT Score | 60.8 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 | Best-of | 0.5 | - | Unverified
2 | Gemma | Best-of | 0.41 | - | Unverified
3 | Baseline | Best-of | 0.41 | - | Unverified
4 | Mistral | Best-of | 0.36 | - | Unverified
5 | Llama2 | Best-of | 0.34 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | BAD | ICAT Score | 23.44 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | RandomForest_default_hyperparameters | Accuracy (%) | 49 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | RoBERTa+ALBERT | F1 | 70.4 | - | Unverified
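
Several of the results above are reported as an ICAT Score, which this page does not define. As a rough sketch, assuming the definition used by StereoSet (the source cited at the top of the page), the score combines a language modeling score (lms) and a stereotype score (ss), both on a 0-100 scale, as lms * min(ss, 100 - ss) / 50. The function name and the sample inputs below are illustrative assumptions, not values taken from this leaderboard.

```python
def icat_score(lms: float, ss: float) -> float:
    """Idealized CAT score, assuming the StereoSet definition.

    lms: language modeling score, 0-100 (higher is better).
    ss:  stereotype score, 0-100 (50 means no preference between
         stereotypical and anti-stereotypical completions).
    """
    if not (0.0 <= lms <= 100.0 and 0.0 <= ss <= 100.0):
        raise ValueError("lms and ss must both lie in [0, 100]")
    # min(ss, 100 - ss) peaks at ss = 50 and falls to 0 at either extreme,
    # so a model is penalized equally for stereotypical and
    # anti-stereotypical preference.
    return lms * min(ss, 100.0 - ss) / 50.0


if __name__ == "__main__":
    # Hypothetical inputs chosen only to illustrate the formula.
    print(icat_score(90.0, 60.0))   # 72.0
    print(icat_score(100.0, 50.0))  # 100.0 (ideal model)
    print(icat_score(100.0, 100.0)) # 0.0 (always picks the stereotype)
```

Under this definition a score of 100 requires both perfect language modeling and no stereotype preference, so a model with strong language modeling can still score low if it is heavily biased.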