On the Optimal Weighted ℓ2 Regularization in Overparameterized Linear Regression
Denny Wu, Ji Xu
Abstract
We consider the linear model y = Xβ* + ε with X ∈ ℝ^{n×p} in the overparameterized regime p > n. We estimate β* via generalized (weighted) ridge regression: β̂_λ = (X^⊤X + λΣ_w)^† X^⊤y, where Σ_w is the weighting matrix. Under a random design setting with general data covariance Σ_x and anisotropic prior Σ_β on the true coefficients, i.e., E[β* β*^⊤] = Σ_β, we provide an exact characterization of the prediction risk E(y − x^⊤β̂_λ)^2 in the proportional asymptotic limit p/n → γ ∈ (1, ∞). Our general setup leads to a number of interesting findings. We outline precise conditions that decide the sign of the optimal setting λ_opt for the ridge parameter λ, and confirm the implicit ℓ2 regularization effect of overparameterization, which theoretically justifies the surprising empirical observation that λ_opt can be negative in the overparameterized regime. We also characterize the double descent phenomenon for principal component regression (PCR) when both X and β* are anisotropic. Finally, we determine the optimal weighting matrix Σ_w for both the ridgeless (λ → 0) and optimally regularized (λ = λ_opt) cases, and demonstrate the advantage of the weighted objective over standard ridge regression and PCR.
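The estimator in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's code: the dimensions, noise level, and choice Σ_w = I below are illustrative assumptions, and the pseudo-inverse is used so that the ridgeless limit λ → 0 is also covered.

```python
import numpy as np

# Illustrative overparameterized setting: p > n (all sizes are assumptions).
rng = np.random.default_rng(0)
n, p = 50, 200
X = rng.standard_normal((n, p))
beta_star = rng.standard_normal(p) / np.sqrt(p)   # true coefficients
y = X @ beta_star + 0.1 * rng.standard_normal(n)  # y = X beta* + noise

def weighted_ridge(X, y, lam, Sigma_w):
    """Generalized (weighted) ridge: (X^T X + lam * Sigma_w)^† X^T y."""
    return np.linalg.pinv(X.T @ X + lam * Sigma_w) @ (X.T @ y)

# Standard ridge regression is the special case Sigma_w = identity.
beta_hat = weighted_ridge(X, y, lam=1.0, Sigma_w=np.eye(p))
```

With λ = 0 and p > n, the pseudo-inverse makes this the minimum-norm interpolator, i.e., X β̂ = y exactly, which is the ridgeless case the paper analyzes.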