Pearson Distance is not a Distance

2019-08-15

Victor Solo

Abstract

The Pearson distance between a pair of random variables $X, Y$ with correlation $\rho_{xy}$, namely $1-\rho_{xy}$, has gained widespread use, particularly for clustering, in areas such as gene expression analysis, brain imaging and cyber security. In all these applications it is implicitly assumed or required that the distance measures be metrics, thus satisfying the triangle inequality. We show, however, that Pearson distance is not a metric. We go on to show that this can be repaired by recalling the result (well known in other literature) that $\sqrt{1-\rho_{xy}}$ is a metric. We similarly show that a related measure of interest, $1-|\rho_{xy}|$, which is invariant to the sign of $\rho_{xy}$, is not a metric but that $\sqrt{1-\rho_{xy}^2}$ is. We also give generalizations of these results.
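The triangle-inequality failure can be checked numerically on a single triple. The sketch below is an illustration, not the paper's own construction: it assumes three hypothetical variables whose pairwise correlations are the cosines of the angles between unit vectors at 0, 45 and 90 degrees (a valid correlation structure), and the helper name `satisfies_triangle` is made up for this example. Passing the check on one triple does not prove that a measure is a metric; failing it does show that the measure is not one.

```python
import math

# Hypothetical correlations (assumed for illustration, not taken from the paper):
# X, Y, Z behave like unit vectors at 0, 45 and 90 degrees.
rho_xy = math.cos(math.pi / 4)  # about 0.7071
rho_yz = math.cos(math.pi / 4)  # about 0.7071
rho_xz = 0.0                    # orthogonal, so uncorrelated


def satisfies_triangle(d, tol=1e-12):
    """Check d(X, Z) <= d(X, Y) + d(Y, Z) for this one triple."""
    return d(rho_xz) <= d(rho_xy) + d(rho_yz) + tol


# The four measures named in the abstract, as functions of the correlation.
candidates = {
    "1 - rho": lambda r: 1.0 - r,
    "sqrt(1 - rho)": lambda r: math.sqrt(1.0 - r),
    "1 - |rho|": lambda r: 1.0 - abs(r),
    "sqrt(1 - rho^2)": lambda r: math.sqrt(1.0 - r * r),
}

results = {name: satisfies_triangle(d) for name, d in candidates.items()}

for name, ok in results.items():
    print(f"{name}: triangle inequality holds on this triple = {ok}")
```

On this triple, $1-\rho_{xz} = 1$ exceeds $(1-\rho_{xy}) + (1-\rho_{yz}) \approx 0.586$, so the Pearson distance violates the triangle inequality, and $1-|\rho|$ fails in the same way. The square-root versions satisfy it here, consistent with the abstract's claims.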
