
Closed-form $\ell_r$ norm scaling with data for overparameterized linear regression and diagonal linear networks under $\ell_p$ bias

2026-03-19

Shuofeng Zhang, Ard Louis

Abstract

For overparameterized linear regression with isotropic Gaussian design and the minimum-$\ell_p$ interpolator, $p \in (1,2]$, we give a unified, high-probability characterization of the scaling of the family of parameter norms $\|\hat{w}_p\|_r$, $r \in [1,p]$, with sample size. We solve this basic but unresolved question through a simple dual-ray analysis, which reveals a competition between a signal *spike* and a *bulk* of null coordinates in $X^\top Y$, yielding closed-form predictions for (i) a data-dependent transition $n_\star$ (the "elbow"), and (ii) a universal threshold $r_\star = 2(p-1)$ that separates the $\|\hat{w}_p\|_r$ which plateau from those that continue to grow with an explicit exponent. This unified solution resolves the scaling of *all* $\ell_r$ norms within the family $r \in [1,p]$ under $\ell_p$-biased interpolation, and explains in one picture which norms saturate and which increase as $n$ grows. We then study diagonal linear networks (DLNs) trained by gradient descent. By calibrating the initialization scale $\alpha$ to an effective $p_{\mathrm{eff}}(\alpha)$ via the DLN separable potential, we show empirically that DLNs inherit the same elbow/threshold laws, providing a predictive bridge between explicit and implicit bias. Given that many generalization proxies depend on $\|\hat{w}_p\|_r$, our results suggest that their predictive power will depend sensitively on which $\ell_r$ norm is used.
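The setup in the abstract is simple to simulate. Below is a minimal sketch (not the authors' code) of the kind of experiment it describes: compute the minimum-$\ell_p$ interpolator on isotropic Gaussian data and track the family $\|\hat{w}_p\|_r$ as $n$ grows. The dimension, the single planted spike, the noiseless labels, and the use of cvxpy are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (not the authors' code): scaling of ||w_hat_p||_r with n
# for the minimum-l_p interpolator under an isotropic Gaussian design.
# Assumed for illustration: d = 400, a single planted spike, noiseless labels.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
d, p = 400, 1.75                 # ambient dimension; l_p bias with p in (1, 2]
r_star = 2 * (p - 1)             # predicted universal threshold r_* = 2(p - 1)
w_star = np.zeros(d)
w_star[0] = 1.0                  # signal "spike"; remaining coordinates are null

for n in [25, 50, 100, 200]:     # overparameterized regime: n < d
    X = rng.standard_normal((n, d))       # isotropic Gaussian design
    y = X @ w_star                         # noiseless labels
    w = cp.Variable(d)
    # minimum-l_p interpolator: min ||w||_p subject to X w = y
    cp.Problem(cp.Minimize(cp.norm(w, p)), [X @ w == y]).solve()
    w_hat = w.value
    # r_star separates the norms that plateau past the elbow n_* from those
    # that keep growing with an explicit exponent (see the paper for details)
    for r in (1.0, r_star, p):
        lr = np.sum(np.abs(w_hat) ** r) ** (1.0 / r)
        print(f"n={n:4d}  r={r:.2f}  ||w_hat||_r = {lr:.3f}")
```

Sweeping $n$ on a log grid and plotting $\|\hat{w}_p\|_r$ against $n$ should make the elbow at $n_\star$ and the plateau/growth split across $r_\star$ visible directly.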
