SOTAVerified

Bias Detection

Bias detection is the task of detecting and measuring racism, sexism, and other discriminatory behavior in a model (Source: https://stereoset.mit.edu/).

Papers

Showing 1-10 of 199 papers

Title | Status | Hype
Galactica: A Large Language Model for Science | Code | 4
Explainable AI in Spatial Analysis | Code | 2
Detecting Emergent Intersectional Biases: Contextualized Word Embeddings Contain a Distribution of Human-like Biases | Code | 1
BiasAsker: Measuring the Bias in Conversational AI System | Code | 1
Debiased Visual Question Answering from Feature and Sample Perspectives | Code | 1
Amazon SageMaker Clarify: Machine Learning Bias Detection and Explainability in the Cloud | Code | 1
BAD: BiAs Detection for Large Language Models in the context of candidate screening | Code | 1
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets | Code | 1
Benchmarking Llama2, Mistral, Gemma and GPT for Factuality, Toxicity, Bias and Propensity for Hallucinations | Code | 1
Benchmarking Bias Mitigation Algorithms in Representation Learning through Fairness Metrics | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-2 (small) | ICAT Score | 72.97 | | Unverified
2 | XLNet (large) | ICAT Score | 72.03 | | Unverified
3 | GPT-2 (medium) | ICAT Score | 71.73 | | Unverified
4 | BERT (base) | ICAT Score | 71.21 | | Unverified
5 | GPT-2 (large) | ICAT Score | 70.54 | | Unverified
6 | BERT (large) | ICAT Score | 69.89 | | Unverified
7 | RoBERTa (base) | ICAT Score | 67.5 | | Unverified
8 | GAL 120B | ICAT Score | 65.6 | | Unverified
9 | XLNet (base) | ICAT Score | 62.1 | | Unverified
10 | GPT-3 (text-davinci-002) | ICAT Score | 60.8 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 | Best-of | 0.5 | | Unverified
2 | Gemma | Best-of | 0.41 | | Unverified
3 | Baseline | Best-of | 0.41 | | Unverified
4 | Mistral | Best-of | 0.36 | | Unverified
5 | Llama2 | Best-of | 0.34 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | BAD | ICAT Score | 23.44 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | RandomForest_default_hyperparameters | Accuracy (%) | 49 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | RoBERTa+ALBERT | F1 | 70.4 | | Unverified
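
Several entries under Benchmark Results report an ICAT Score. As a rough illustration, assuming these entries follow the Idealized CAT definition from the StereoSet paper, the metric combines a language modeling score (lms) and a stereotype score (ss), both percentages, as lms * min(ss, 100 - ss) / 50. The function and parameter names below are hypothetical, and the input values are made up for illustration, not taken from the tables.

```python
def icat_score(lms: float, ss: float) -> float:
    """Sketch of the Idealized CAT score, assuming the StereoSet definition.

    lms: language modeling score, a percentage in [0, 100]
    ss:  stereotype score, a percentage in [0, 100]; 50 means no preference
         between stereotypical and anti-stereotypical continuations
    """
    if not (0.0 <= lms <= 100.0 and 0.0 <= ss <= 100.0):
        raise ValueError("lms and ss must be percentages in [0, 100]")
    # min(ss, 100 - ss) peaks at ss = 50 and drops to 0 at either extreme,
    # so a fully biased model scores 0 regardless of its lms.
    return lms * min(ss, 100.0 - ss) / 50.0


# Illustrative inputs only (not values from the leaderboard).
print(icat_score(100.0, 50.0))  # ideal model: perfect lms, no stereotype preference
print(icat_score(80.0, 60.0))   # decent lms, mild stereotype preference
print(icat_score(100.0, 100.0)) # perfect lms, always picks the stereotype
```

Under this definition a higher score is better, and a score can only approach 100 when the model both ranks meaningful continuations highly and shows no systematic preference for stereotypical ones.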