Variable Selection in Convex Piecewise Linear Regression
Haitham Kanj, Seonho Kim, Kiryung Lee
Abstract
This paper presents sparse gradient descent (Sp-GD) as a solution for variable selection in convex piecewise linear regression, where the model is given by $\max_{1 \le j \le k} \langle a_j^*, x \rangle + b_j^*$ with covariate vector $x \in \mathbb{R}^d$. Here, $\{a_j^*\}_{j=1}^k$ and $\{b_j^*\}_{j=1}^k$ denote the ground-truth weight vectors and intercepts. A non-asymptotic local convergence analysis is provided for Sp-GD under sub-Gaussian noise when the covariate distribution satisfies sub-Gaussianity and an anti-concentration property. When the model order and parameters are fixed, Sp-GD provides an $\epsilon$-accurate estimate given $\mathcal{O}(\max(\epsilon^{-2}\sigma_z^2, 1)\, s \log(d/s))$ observations, where $\sigma_z^2$ denotes the noise variance. This also implies exact parameter recovery by Sp-GD from $\mathcal{O}(s \log(d/s))$ noise-free observations. Since optimizing the squared loss for sparse max-affine regression is non-convex, an initialization scheme is proposed to provide a suitable initial estimate within the basin of attraction for Sp-GD, i.e., one sufficiently accurate to invoke the convergence guarantees. The initialization scheme uses sparse principal component analysis to estimate the subspace spanned by $\{a_j^*\}_{j=1}^k$, then applies an $r$-covering search to estimate the model parameters. A non-asymptotic analysis is presented for this initialization scheme when the covariates and noise samples follow Gaussian distributions. When the model order and parameters are fixed, the initialization scheme provides an $\epsilon$-accurate estimate given $\mathcal{O}(\epsilon^{-2} \max(\sigma_z^4, \sigma_z^2, 1)\, s^2 \log^4(d))$ observations. Numerical Monte Carlo results corroborate the theoretical findings for Sp-GD and the initialization scheme.
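To make the Sp-GD iteration concrete, the sketch below implements one plausible reading of it: a gradient step on the squared loss of the max-affine model, followed by a projection onto the sparsity constraint. The projection shown (joint hard thresholding that keeps the $s$ covariates with largest energy across all $k$ weight vectors, matching the shared-support variable-selection setting) and all step-size and iteration choices are assumptions for illustration, not the paper's exact specification.

```python
import numpy as np

rng = np.random.default_rng(0)

def sp_gd(X, y, k, s, step=0.1, n_iter=500):
    """Hedged sketch of sparse gradient descent (Sp-GD) for s-sparse
    max-affine regression y ~ max_j <a_j, x> + b_j.

    Assumption: the sparsity projection is joint (row-energy) hard
    thresholding onto a support of s covariates shared by all pieces.
    """
    n, d = X.shape
    A = 0.01 * rng.standard_normal((k, d))  # weight estimates (k pieces)
    b = np.zeros(k)                         # intercept estimates
    for _ in range(n_iter):
        Z = X @ A.T + b                     # (n, k) affine pieces
        j = Z.argmax(axis=1)                # active piece per sample
        r = Z[np.arange(n), j] - y          # residuals of the max-affine fit
        gA = np.zeros_like(A)
        gb = np.zeros_like(b)
        for t in range(k):
            m = (j == t)                    # samples where piece t is active
            gA[t] = r[m] @ X[m] / n
            gb[t] = r[m].sum() / n
        A -= step * gA
        b -= step * gb
        # projection: keep the s covariates with largest column energy
        # across all pieces, zero out the remaining d - s columns
        keep = np.argsort(-np.linalg.norm(A, axis=0))[:s]
        mask = np.zeros(d, dtype=bool)
        mask[keep] = True
        A[:, ~mask] = 0.0
    return A, b
```

In this reading, each iterate is guaranteed to have at most $s$ active covariates, and the gradient of the squared loss only flows through the piece attaining the maximum for each sample, which is what makes the objective piecewise smooth but non-convex and motivates the paper's separate initialization scheme.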