SOTAVerified

Bias Detection

Bias detection is the task of detecting and measuring racist, sexist, and otherwise discriminatory behavior in a model (Source: https://stereoset.mit.edu/).

Papers

Showing 1-50 of 199 papers

Title | Status | Hype
How Neural Networks Organize Concepts: Introducing Concept Trajectory Analysis for Deep Learning Interpretability | Code | 0
Cascading Adversarial Bias from Injection to Distillation in Language Models | - | 0
Can we Debias Social Stereotypes in AI-Generated Images? Examining Text-to-Image Outputs and User Perceptions | - | 0
Any Large Language Model Can Be a Reliable Judge: Debiasing with a Reasoning-based Bias Detector | - | 0
BiasLab: Toward Explainable Political Bias Detection with Dual-Axis Annotations and Rationale Indicators | - | 0
To Bias or Not to Bias: Detecting bias in News with bias-detector | Code | 0
Can Global XAI Methods Reveal Injected Bias in LLMs? SHAP vs Rule Extraction vs RuleSHAP | Code | 0
Efficient Fairness Testing in Large Language Models: Prioritizing Metamorphic Relations for Bias Detection | - | 0
Explainable AI in Spatial Analysis | Code | 2
BiasGuard: A Reasoning-enhanced Bias Detection Tool For Large Language Models | - | 0
Toward Holistic Evaluation of Recommender Systems Powered by Generative Models | - | 0
Neutralizing the Narrative: AI-Powered Debiasing of Online News Articles | - | 0
STOOD-X methodology: using statistical nonparametric test for OOD Detection Large-Scale datasets enhanced with explainability | - | 0
On the Mutual Influence of Gender and Occupation in LLM Representations | - | 0
Fine-Grained Bias Detection in LLM: Enhancing detection mechanisms for nuanced biases | - | 0
Cognitive Bias Detection Using Advanced Prompt Engineering | - | 0
Visual Reasoning Evaluation of Grok, Deepseek Janus, Gemini, Qwen, Mistral, and ChatGPT | - | 0
Robust Bias Detection in MLMs and its Application to Human Trait Ratings | Code | 0
Detecting Linguistic Bias in Government Documents Using Large language Models | - | 0
Towards Equitable AI: Detecting Bias in Using Large Language Models for Marketing | - | 0
BiaSWE: An Expert Annotated Dataset for Misogyny Detection in Swedish | - | 0
FairT2I: Mitigating Social Bias in Text-to-Image Generation via Large Language Model-Assisted Detection and Attribute Rebalancing | - | 0
LLMs can be easily Confused by Instructional Distractions | - | 0
Sample Complexity of Bias Detection with Subsampled Point-to-Subspace Distances | - | 0
Bias Detection via Maximum Subgroup Discrepancy | - | 0
Auditing a Dutch Public Sector Risk Profiling Algorithm Using an Unsupervised Bias Detection Tool | - | 0
Unmasking Conversational Bias in AI Multiagent Systems | - | 0
Decoding News Bias: Multi Bias Detection in News Articles | - | 0
Classifier-to-Bias: Toward Unsupervised Automatic Bias Detection for Visual Classifiers | - | 0
ViLBias: A Comprehensive Framework for Bias Detection through Linguistic and Visual Cues, presenting Annotation Strategies, Evaluation, and Key Challenges | Code | 0
Improved Models for Media Bias Detection and Subcategorization | - | 0
MT-LENS: An all-in-one Toolkit for Better Machine Translation Evaluation | Code | 1
Towards Understanding and Quantifying Uncertainty for Text-to-Image Generation | - | 0
MediaSpin: Exploring Media Bias Through Fine-Grained Analysis of News Headlines | - | 0
Bias Analysis of AI Models for Undergraduate Student Admissions | - | 0
The Promises and Pitfalls of LLM Annotations in Dataset Labeling: a Case Study on Media Bias Detection | Code | 0
Bias in Large Language Models: Origin, Evaluation, and Mitigation | - | 0
Mitigating Bias in Queer Representation within Large Language Models: A Collaborative Agent Approach | Code | 0
Current State-of-the-Art of Bias Detection and Mitigation in Machine Translation for African and European Languages: a Review | - | 0
Quantifying Risk Propensities of Large Language Models: Ethical Focus and Bias Detection through Role-Play | - | 0
Can We Trust AI Agents? A Case Study of an LLM-Based Multi-Agent System for Ethical AI | - | 0
debiaSAE: Benchmarking and Mitigating Vision-Language Model Bias | Code | 0
With a Grain of SALT: Are LLMs Fair Across Social Dimensions? | - | 0
GUS-Net: Social Bias Classification in Text with Generalizations, Unfairness, and Stereotypes | Code | 0
TinyEmo: Scaling down Emotional Reasoning via Metric Projection | Code | 0
Mitigating the Risk of Health Inequity Exacerbated by Large Language Models | - | 0
Towards Implicit Bias Detection and Mitigation in Multi-Agent LLM Interactions | Code | 0
Counterfactual Token Generation in Large Language Models | Code | 1
Towards Fairer Health Recommendations: finding informative unbiased samples via Word Sense Disambiguation | - | 0
Explainable AI for computational pathology identifies model limitations and tissue biomarkers | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-2 (small) | ICAT Score | 72.97 | - | Unverified
2 | XLNet (large) | ICAT Score | 72.03 | - | Unverified
3 | GPT-2 (medium) | ICAT Score | 71.73 | - | Unverified
4 | BERT (base) | ICAT Score | 71.21 | - | Unverified
5 | GPT-2 (large) | ICAT Score | 70.54 | - | Unverified
6 | BERT (large) | ICAT Score | 69.89 | - | Unverified
7 | RoBERTa (base) | ICAT Score | 67.5 | - | Unverified
8 | GAL 120B | ICAT Score | 65.6 | - | Unverified
9 | XLNet (base) | ICAT Score | 62.1 | - | Unverified
10 | GPT-3 (text-davinci-002) | ICAT Score | 60.8 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 | Best-of | 0.5 | - | Unverified
2 | Gemma | Best-of | 0.41 | - | Unverified
3 | Baseline | Best-of | 0.41 | - | Unverified
4 | Mistral | Best-of | 0.36 | - | Unverified
5 | Llama2 | Best-of | 0.34 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | BAD | ICAT Score | 23.44 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | RandomForest_default_hyperparameters | Accuracy (%) | 49 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | RoBERTa+ALBERT | F1 | 70.4 | - | Unverified
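
Several of the results above are reported as an ICAT Score. As a rough sketch of how such a score can be computed, assuming the Idealized CAT definition used by StereoSet (the source linked at the top of this page): the score combines a language modeling score (lms) and a stereotype score (ss), both on a 0-100 scale, and is highest when the model is a strong language model with no preference for stereotypical over anti-stereotypical text (ss = 50). The function name `icat_score` and the input values in the usage lines are illustrative assumptions and are not taken from this page.

```python
def icat_score(lms: float, ss: float) -> float:
    """Idealized CAT score, assuming the StereoSet definition.

    lms: language modeling score, 0..100 (higher is better).
    ss:  stereotype score, 0..100 (50 means no stereotypical preference).
    """
    if not (0.0 <= lms <= 100.0 and 0.0 <= ss <= 100.0):
        raise ValueError("lms and ss must both lie in [0, 100]")
    # min(ss, 100 - ss) / 50 is 1.0 at ss = 50 and falls to 0.0 at ss = 0 or 100,
    # so any deviation from an unbiased stereotype score scales lms down.
    return lms * min(ss, 100.0 - ss) / 50.0


# Hypothetical inputs, for illustration only.
print(icat_score(100.0, 50.0))  # ideal model
print(icat_score(80.0, 60.0))   # good language model, mild stereotypical preference
```

Under this definition a fully stereotyped model (ss = 100) scores 0 regardless of its language modeling score, which is why a model with a lower lms can still rank above one with a higher lms in an ICAT-based table.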