
DiffRed: Dimensionality Reduction guided by stable rank

2024-03-09

Prarabdh Shukla, Gagan Raj Gupta, Kunal Dutta


Abstract

In this work, we propose a novel dimensionality reduction technique, DiffRed, which first projects the data matrix, A, along the first k_1 principal components and then projects the residual matrix A^* (left after subtracting the rank-k_1 approximation of A) along k_2 Gaussian random vectors. We evaluate M1, the distortion of the mean-squared pairwise distance, and Stress, the normalized RMS distortion of the pairwise distances. We rigorously prove that DiffRed achieves a general upper bound of O(\sqrt{(1-p)/k_2}) on Stress and O((1-p)/\sqrt{k_2 \rho(A^*)}) on M1, where p is the fraction of variance explained by the first k_1 principal components and \rho(A^*) is the stable rank of A^*. These bounds are tighter than the currently known results for random maps. Our extensive experiments on a variety of real-world datasets demonstrate that DiffRed achieves near-zero M1 and much lower Stress than well-known dimensionality reduction techniques. In particular, DiffRed can map a 6-million-dimensional dataset to 10 dimensions with 54% lower Stress than PCA.
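The two-stage projection described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function names (`diffred`, `stable_rank`, `stress`, `m1`), the mean-centering step, the 1/sqrt(k_2) scaling of the Gaussian matrix, and the exact formulas used for Stress and M1 are all assumptions chosen to match the abstract's verbal definitions.

```python
import numpy as np


def stable_rank(M):
    """Stable rank: squared Frobenius norm over squared spectral norm."""
    s = np.linalg.svd(M, compute_uv=False)
    return float((s ** 2).sum() / s[0] ** 2)


def diffred(A, k1, k2, rng):
    """Hypothetical DiffRed sketch: k1 PCA directions plus k2 random ones."""
    A = A - A.mean(axis=0)  # assumption: data is mean-centered first
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    V1 = Vt[:k1].T  # first k1 principal directions, shape (d, k1)
    Z1 = A @ V1  # projection along the principal components
    A_res = A - Z1 @ V1.T  # residual after removing the rank-k1 approximation
    # assumption: i.i.d. N(0, 1) entries scaled by 1/sqrt(k2), so squared
    # norms of the residual are preserved in expectation
    G = rng.standard_normal((A.shape[1], k2)) / np.sqrt(k2)
    Z2 = A_res @ G  # random projection of the residual
    p = float((s[:k1] ** 2).sum() / (s ** 2).sum())  # explained variance
    return np.hstack([Z1, Z2]), A_res, p


def pairwise_distances(X):
    """Euclidean distances over all unordered pairs of rows."""
    sq = (X ** 2).sum(axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    return np.sqrt(D2[np.triu_indices(len(X), k=1)])


def stress(X, Z):
    """Assumed definition: RMS distance error normalized by original RMS."""
    d, dz = pairwise_distances(X), pairwise_distances(Z)
    return float(np.sqrt(((d - dz) ** 2).sum() / (d ** 2).sum()))


def m1(X, Z):
    """Assumed definition: relative error of summed squared distances."""
    d, dz = pairwise_distances(X), pairwise_distances(Z)
    return float(abs(1.0 - (dz ** 2).sum() / (d ** 2).sum()))


rng = np.random.default_rng(0)
A = rng.standard_normal((50, 200))  # synthetic data, for illustration only
Z, A_res, p = diffred(A, k1=5, k2=10, rng=rng)
print(Z.shape, p, stable_rank(A_res), stress(A, Z), m1(A, Z))
```

Under these assumptions, a larger p (more variance captured by the principal components) or a larger k_2 should push both metrics down, in line with the stated bounds; the stable rank of the residual only enters the bound on M1.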
