
Resolution of Simpson's paradox via the common cause principle

2024-03-01

A. Hovhannisyan, A. E. Allahverdyan


Abstract

Simpson's paradox is an obstacle to establishing a probabilistic association between two events a_1 and a_2, given a third (lurking) random variable B. We focus on scenarios in which the random variables A (which combines a_1, a_2, and their complements) and B have a common cause C that need not be observed. Alternatively, we can assume that C screens out A from B. For such cases, the correct association between a_1 and a_2 is to be defined via conditioning over C. This setup generalizes the original Simpson's paradox: its two contradicting options now refer to two particular and different causes C. We show that if B and C are binary and A is quaternary (the minimal and most widespread situation for Simpson's paradox), conditioning over any binary common cause C establishes the same direction of association between a_1 and a_2 as conditioning over B in the original formulation of the paradox. Thus, for the minimal common cause, one should choose the option of Simpson's paradox that assumes conditioning over B rather than marginalizing over it. The same conclusion is reached when Simpson's paradox is formulated via three continuous Gaussian variables: within the minimal formulation of the paradox (three scalar continuous variables A_1, A_2, and B), one should choose the option with conditioning over B.
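To make the two "options" of the paradox concrete, here is a minimal sketch in Python. It is not the authors' code. The counts are illustrative, patterned on the familiar kidney-stone textbook example, and the names `counts` and `p_a2_given` are hypothetical. The sketch compares the association between a_1 and a_2 after marginalizing over a binary B with the association after conditioning over each value of B, and shows that the two can point in opposite directions.

```python
from fractions import Fraction

# Illustrative counts (an assumption of this sketch, not data from the paper).
# counts[(a1, b)] = (number of cases where a2 occurred, total number of cases)
counts = {
    (True, 0): (81, 87),
    (True, 1): (192, 263),
    (False, 0): (234, 270),
    (False, 1): (55, 80),
}

def p_a2_given(a1, b=None):
    """Return p(a2 | a1) if b is None, otherwise p(a2 | a1, B=b)."""
    rows = [
        cell
        for (key_a1, key_b), cell in counts.items()
        if key_a1 == a1 and (b is None or key_b == b)
    ]
    hits = sum(h for h, _ in rows)
    total = sum(n for _, n in rows)
    return Fraction(hits, total)

# Option 1: marginalize over B and compare p(a2 | a1) with p(a2 | not a1).
marginal_favors_a1 = p_a2_given(True) > p_a2_given(False)

# Option 2: condition over B and compare within each value of B.
conditional_favors_a1 = {
    b: p_a2_given(True, b) > p_a2_given(False, b) for b in (0, 1)
}

print("marginal:", float(p_a2_given(True)), "vs", float(p_a2_given(False)))
for b in (0, 1):
    print(f"B={b}:", float(p_a2_given(True, b)), "vs", float(p_a2_given(False, b)))
```

With these counts, a_1 looks unfavorable for a_2 after marginalizing over B, yet favorable within each value of B. The abstract's claim is that, in the minimal setting with a binary common cause C, the direction obtained by conditioning over B is the one to keep. The sketch only exhibits the reversal; it does not construct C or verify that claim.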
