Don’t Forget About Pronouns: Removing Gender Bias in Language Models without Losing Factual Gender Information

2022-01-16 · ACL ARR January 2022

Anonymous

Abstract

The representations in large language models contain various types of gender information. We focus on two types of such signals in English texts: factual gender information, which is a grammatical or semantic property, and gender bias, which is the correlation between a word and a specific gender. Using probing, we can disentangle the model’s embeddings and identify the components that encode each type of information. We aim to diminish the representation of stereotypical bias while preserving the factual gender signal. Our filtering method shows that it is possible to decrease the bias of gender-neutral profession names without deteriorating language modeling capabilities. The findings can be applied to language generation and understanding to mitigate reliance on stereotypes while preserving gender agreement in coreferences.
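
The abstract does not spell out the filtering procedure, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes that probing has already produced two direction vectors in embedding space, a hypothetical `bias_dir` and a hypothetical `factual_dir`, and it removes the part of the bias direction that is orthogonal to the factual direction. All names and numbers here are illustrative assumptions.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [a / norm for a in v]


def remove_bias(embedding, bias_dir, factual_dir):
    """Project out the bias component that is orthogonal to the factual one.

    Assumption: bias_dir and factual_dir are hypothetical directions that a
    probe has already identified; they are not taken from the paper.
    """
    f_hat = normalize(factual_dir)
    # Strip from the bias direction whatever it shares with the factual one.
    overlap = dot(bias_dir, f_hat)
    b_perp = normalize([b - overlap * f for b, f in zip(bias_dir, f_hat)])
    # Remove the embedding's component along the remaining bias direction.
    coeff = dot(embedding, b_perp)
    return [x - coeff * b for x, b in zip(embedding, b_perp)]


# Toy 3-dimensional example with made-up values.
embedding = [0.5, 0.8, 0.3]
factual_dir = [1.0, 0.0, 0.0]
bias_dir = [0.2, 1.0, 0.0]
filtered = remove_bias(embedding, bias_dir, factual_dir)
```

Because the removed direction is orthogonal to `factual_dir`, the projection of the embedding onto the factual direction is unchanged, which mirrors the stated goal of reducing bias while keeping the factual gender signal.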
