
An algorithm for controlled text analysis on Wikipedia

2020-05-28

Anonymous


Abstract

While numerous studies have examined bias on Wikipedia, most approaches fail to control for possible confounding variables. In this work, given a target corpus for analysis (e.g., biography pages about women), we present a method for constructing a control corpus that matches the target corpus in as many attributes as possible, except the target attribute (e.g., the gender of the subject). This methodology can be used to analyze specific types of bias in Wikipedia articles, such as gender or racial bias, while minimizing the influence of confounding variables.
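
As a rough illustration of the matching idea in the abstract, the sketch below pairs each target article with the unused candidate article that shares the most attributes with it. The abstract does not specify the paper's actual matching procedure, so everything here is an assumption made for illustration: the use of Jaccard similarity, the greedy one-to-one matching, the function names `jaccard` and `build_control_corpus`, and the toy article titles and attributes. The target attribute (e.g., gender) is assumed to have been removed from the attribute sets beforehand, so that it cannot influence the match.

```python
from typing import Dict, Optional, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two attribute sets (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def build_control_corpus(
    target: Dict[str, Set[str]],
    candidates: Dict[str, Set[str]],
) -> Dict[str, Optional[str]]:
    """Greedily match each target article to its most similar unused candidate.

    Hypothetical helper: attribute sets must already exclude the target
    attribute. A target with no overlapping candidate is mapped to None.
    """
    available = dict(candidates)  # copy, so the caller's dict is not mutated
    matches: Dict[str, Optional[str]] = {}
    for title, attrs in target.items():
        best_title: Optional[str] = None
        best_score = 0.0
        for cand_title, cand_attrs in available.items():
            score = jaccard(attrs, cand_attrs)
            if score > best_score:
                best_title, best_score = cand_title, score
        matches[title] = best_title
        if best_title is not None:
            del available[best_title]  # match without replacement
    return matches


# Toy data (invented): attributes stand in for, e.g., Wikipedia categories.
target = {
    "Ada Example": {"1900 births", "British mathematicians", "Computer scientists"},
    "Bea Sample": {"1950 births", "American novelists"},
}
candidates = {
    "Carl Example": {"1900 births", "British mathematicians", "Physicists"},
    "Dan Sample": {"1950 births", "American novelists", "Poets"},
    "Ed Other": {"Footballers"},
}

matches = build_control_corpus(target, candidates)
for target_title, control_title in matches.items():
    print(target_title, "->", control_title)
```

Greedy matching is order-dependent and may not be globally optimal. It is used here only to keep the sketch short.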
