SOTAVerified

Bias Detection

Bias detection is the task of detecting and measuring racism, sexism, and other discriminatory behavior in a model (Source: https://stereoset.mit.edu/).

Papers

Showing 11-20 of 199 papers

Title | Status | Hype
Debiased Visual Question Answering from Feature and Sample Perspectives | Code | 1
Counterfactual Token Generation in Large Language Models | Code | 1
Amazon SageMaker Clarify: Machine Learning Bias Detection and Explainability in the Cloud | Code | 1
Learning to Split for Automatic Bias Detection | Code | 1
Introducing MBIB -- the first Media Bias Identification Benchmark Task and Dataset Collection | Code | 1
Benchmarking Llama2, Mistral, Gemma and GPT for Factuality, Toxicity, Bias and Propensity for Hallucinations | Code | 1
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets | Code | 1
Benchmarking Bias Mitigation Algorithms in Representation Learning through Fairness Metrics | Code | 1
BiasAsker: Measuring the Bias in Conversational AI System | Code | 1
Neural Media Bias Detection Using Distant Supervision With BABE -- Bias Annotations By Experts | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-2 (small) | ICAT Score | 72.97 |  | Unverified
2 | XLNet (large) | ICAT Score | 72.03 |  | Unverified
3 | GPT-2 (medium) | ICAT Score | 71.73 |  | Unverified
4 | BERT (base) | ICAT Score | 71.21 |  | Unverified
5 | GPT-2 (large) | ICAT Score | 70.54 |  | Unverified
6 | BERT (large) | ICAT Score | 69.89 |  | Unverified
7 | RoBERTa (base) | ICAT Score | 67.5 |  | Unverified
8 | GAL 120B | ICAT Score | 65.6 |  | Unverified
9 | XLNet (base) | ICAT Score | 62.1 |  | Unverified
10 | GPT-3 (text-davinci-002) | ICAT Score | 60.8 |  | Unverified
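
The ICAT Score listed above appears to be the Idealized CAT score from StereoSet, the source cited in the task description. The following is a minimal sketch of how such a score is computed, assuming the StereoSet definition icat = lms * min(ss, 100 - ss) / 50, where lms is the language modeling score and ss is the stereotype score, both on a 0-100 scale. The function name and the example inputs are illustrative assumptions and are not taken from this page or from any leaderboard entry.

```python
def icat_score(lms: float, ss: float) -> float:
    """Idealized CAT score (assumed StereoSet definition).

    lms: language modeling score in [0, 100] (higher is better).
    ss:  stereotype score in [0, 100] (50 means no preference between
         stereotypical and anti-stereotypical continuations).
    """
    if not (0.0 <= lms <= 100.0 and 0.0 <= ss <= 100.0):
        raise ValueError("lms and ss must both lie in [0, 100]")
    # The score is highest when the model is fluent (lms = 100) and
    # unbiased (ss = 50); it drops to 0 when ss reaches 0 or 100.
    return lms * min(ss, 100.0 - ss) / 50.0


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to illustrate the formula.
    print(icat_score(100.0, 50.0))  # ideal model
    print(icat_score(80.0, 60.0))   # fluent but somewhat stereotyped model
```

Under this definition a model is rewarded only when it is both a good language model and close to neutral, so a high ICAT Score cannot be reached by fluency alone.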

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 | Best-of | 0.5 |  | Unverified
2 | Baseline | Best-of | 0.41 |  | Unverified
3 | Gemma | Best-of | 0.41 |  | Unverified
4 | Mistral | Best-of | 0.36 |  | Unverified
5 | Llama2 | Best-of | 0.34 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | BAD | ICAT Score | 23.44 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | RandomForest_default_hyperparameters | Accuracy (%) | 49 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | RoBERTa+ALBERT | F1 | 70.4 |  | Unverified