Statistical Inference in Classification of High-dimensional Gaussian Mixture

2024-10-25

Hanwen Huang, Peng Zeng

Abstract

We consider the classification problem of a high-dimensional mixture of two Gaussians with general covariance matrices. Using the replica method from statistical physics, we investigate the asymptotic behavior of a general class of regularized convex classifiers in the high-dimensional limit, where both the sample size n and the dimension p approach infinity while their ratio n/p remains fixed. Our focus is on the generalization error and variable selection properties of the estimators. Specifically, based on the distributional limit of the classifier, we construct a de-biased estimator to perform variable selection through an appropriate hypothesis testing procedure. Using L_1-regularized logistic regression as an example, we conduct extensive computational experiments to confirm that our analytical findings are consistent with numerical simulations in finite-size systems. We also explore the influence of the covariance structure on the performance of the de-biased estimator.
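
To make the setting concrete, here is a minimal finite-size simulation sketch of the kind of experiment the abstract describes. It is not the authors' code. It assumes identity covariance (the paper treats general covariance matrices), a hypothetical sparse mean vector `mu`, a hypothetical penalty level `lam`, and a plain proximal-gradient (ISTA) solver for L_1-regularized logistic regression. The de-biased estimator and the hypothesis test are not reproduced here, since the abstract does not specify their form.

```python
import numpy as np


def simulate_mixture(n, p, mu, rng):
    """Draw n points from a balanced two-Gaussian mixture.

    Labels y are +1 or -1 with equal probability and x ~ N(y * mu, I_p).
    Identity covariance is an assumption made for this sketch.
    """
    y = rng.choice([-1.0, 1.0], size=n)
    x = y[:, None] * mu[None, :] + rng.standard_normal((n, p))
    return x, y


def soft_threshold(w, t):
    """Proximal operator of t * ||w||_1."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)


def fit_l1_logistic(x, y, lam, n_iter=500):
    """Minimise mean(log(1 + exp(-y * x @ w))) + lam * ||w||_1 with ISTA."""
    n, p = x.shape
    # The logistic-loss gradient is Lipschitz with constant ||X||_2^2 / (4 n),
    # so the reciprocal is a safe fixed step size.
    step = 4.0 * n / (np.linalg.norm(x, 2) ** 2)
    w = np.zeros(p)
    for _ in range(n_iter):
        margins = y * (x @ w)
        # sigmoid(-m) written with tanh for numerical stability
        weights = 0.5 * (1.0 - np.tanh(0.5 * margins))
        grad = -(x.T @ (y * weights)) / n
        w = soft_threshold(w - step * grad, step * lam)
    return w


rng = np.random.default_rng(0)
n, p = 400, 200                 # finite-size system with ratio n/p = 2
mu = np.zeros(p)
mu[:10] = 0.8                   # hypothetical: 10 informative coordinates

x_train, y_train = simulate_mixture(n, p, mu, rng)
x_test, y_test = simulate_mixture(5000, p, mu, rng)

w_hat = fit_l1_logistic(x_train, y_train, lam=0.05)

# Empirical generalization error on fresh data and size of the selected support
test_error = float(np.mean(np.sign(x_test @ w_hat) != y_test))
n_selected = int(np.count_nonzero(w_hat))
print(f"test error: {test_error:.3f}, selected variables: {n_selected} of {p}")
```

Repeating this over several seeds and growing n and p at a fixed ratio n/p is one way to compare empirical error curves against an asymptotic prediction.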
