Sign Stable Random Projections for Large-Scale Learning
Ping Li
Abstract
We study the use of "sign α-stable random projections" (where 0 < α ≤ 2) for building basic data processing tools in the context of large-scale machine learning applications (e.g., classification, regression, clustering, and near-neighbor search). After processing by sign α-stable random projections, the inner products of the processed data approximate various types of nonlinear kernels depending on the value of α. Thus, this approach provides an effective strategy for approximating nonlinear learning algorithms at essentially the cost of linear learning. When α = 2, it is known that the corresponding nonlinear kernel is the arc-cosine kernel. When α = 1, the procedure approximates the arc-cos-χ² kernel (under a certain condition). When α → 0+, it corresponds to the resemblance kernel. From a practitioner's perspective, the method of sign α-stable random projections is ready to be tested for large-scale learning applications, where α can simply be viewed as a tuning parameter. What is missing in the literature is an extensive empirical study demonstrating the effectiveness of sign stable random projections, especially for α ≠ 2 or 1. This paper supplies such a study on a wide variety of classification datasets. In particular, we compare, side by side, sign stable random projections with the recently proposed "0-bit consistent weighted sampling (CWS)" (Li, 2015).
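As an illustration of the general idea (not the paper's own code), the following sketch generates symmetric α-stable projection entries via the standard Chambers–Mallows–Stuck sampler, keeps only the signs of the projected data, and, for α = 2, checks the sign-collision frequency against the arc-cosine kernel value 1 − ρ/π, where ρ is the angle between the two vectors. The function name `stable_rp_signs` and all parameter choices are illustrative assumptions.

```python
import numpy as np

def stable_rp_signs(X, k, alpha, rng):
    """Project the rows of X onto k directions with i.i.d. symmetric
    alpha-stable entries (Chambers-Mallows-Stuck sampler), then keep
    only the signs of the projections. Illustrative sketch."""
    n_features = X.shape[1]
    U = rng.uniform(-np.pi / 2, np.pi / 2, size=(n_features, k))
    W = rng.exponential(1.0, size=(n_features, k))
    # CMS formula for a symmetric alpha-stable random variable
    R = (np.sin(alpha * U) / np.cos(U) ** (1.0 / alpha)
         * (np.cos(U - alpha * U) / W) ** ((1.0 - alpha) / alpha))
    return np.sign(X @ R)

# For alpha = 2 (Gaussian projections), the probability that two vectors
# x and y receive the same sign is 1 - rho / pi, with rho the angle
# between them -- i.e., the arc-cosine kernel mentioned in the abstract.
rng = np.random.default_rng(0)
x = np.array([[1.0, 0.5, 0.2]])
y = np.array([[0.9, 0.6, 0.1]])
signs = stable_rp_signs(np.vstack([x, y]), 20000, 2.0, rng)
match = np.mean(signs[0] == signs[1])          # empirical collision rate
cos_sim = (x @ y.T).item() / (np.linalg.norm(x) * np.linalg.norm(y))
expected = 1 - np.arccos(cos_sim) / np.pi      # arc-cosine kernel value
```

With 20,000 projections the empirical collision rate `match` should lie close to `expected`; lowering α toward 0 in the same sketch yields the other kernels discussed in the abstract, though their closed forms differ.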