
Improved error rates for sparse (group) learning with Lipschitz loss functions

2019-10-20

Antoine Dedieu



Abstract

We study a family of sparse estimators defined as minimizers of some empirical Lipschitz loss function -- which includes the hinge loss, the logistic loss and the quantile regression loss -- with a convex, sparse or group-sparse regularization. In particular, we consider the L1 norm on the coefficients, its sorted Slope version, and the Group L1-L2 extension. We propose a new theoretical framework that uses common assumptions in the literature to simultaneously derive new high-dimensional L2 estimation upper bounds for all three regularization schemes. For L1 and Slope regularizations, our bounds scale as (k^*/n) log(p/k^*) -- n × p is the size of the design matrix and k^* the dimension of the theoretical loss minimizer β^* -- and match the optimal minimax rate achieved for the least-squares case. For Group L1-L2 regularization, our bounds scale as (s^*/n) log(G/s^*) + m^*/n -- G is the total number of groups and m^* the number of coefficients in the s^* groups which contain β^* -- and improve over the least-squares case. We show that, when the signal is strongly group-sparse, Group L1-L2 is superior to L1 and Slope. In addition, we adapt our approach to the sub-Gaussian linear regression framework and reach the optimal minimax rate for Lasso, and an improved rate for Group-Lasso. Finally, we release an accelerated proximal algorithm that computes the nine main convex estimators of interest when the number of variables is of the order of 100,000s.
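The released solver itself is not reproduced here; as an illustration of the kind of accelerated proximal scheme the abstract refers to, below is a minimal sketch of a FISTA-style step for one estimator in this family (logistic loss with an L1 penalty), together with the group soft-thresholding prox used by the Group L1-L2 penalty. The names and conventions (X, labels y in {-1, +1}, lam, groups as a list of index arrays) are assumptions made for the sketch, not the paper's interface.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (elementwise soft-thresholding), used for L1.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def group_soft_threshold(v, t, groups):
    # Proximal operator of t * sum_g ||v_g||_2 (Group L1-L2 penalty).
    # `groups` is an assumed list of index arrays, one per group.
    out = np.zeros_like(v)
    for g in groups:
        norm = np.linalg.norm(v[g])
        if norm > t:
            out[g] = (1.0 - t / norm) * v[g]
    return out

def logistic_loss_grad(beta, X, y):
    # Gradient of the (smooth, Lipschitz) logistic loss
    # (1/n) * sum_i log(1 + exp(-y_i * x_i' beta)), with labels y in {-1, +1}.
    n = X.shape[0]
    z = X @ beta
    return -(X.T @ (y / (1.0 + np.exp(y * z)))) / n

def fista_l1_logistic(X, y, lam, n_iter=500):
    # Accelerated proximal gradient (FISTA) for L1-regularized logistic regression:
    # gradient step on the smooth loss, soft-thresholding prox, Nesterov momentum.
    n, p = X.shape
    L = np.linalg.norm(X, 2) ** 2 / (4.0 * n)  # Lipschitz constant of the gradient
    beta = np.zeros(p)
    z = beta.copy()
    t = 1.0
    for _ in range(n_iter):
        grad = logistic_loss_grad(z, X, y)
        beta_new = soft_threshold(z - grad / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t ** 2)) / 2.0
        z = beta_new + ((t - 1.0) / t_new) * (beta_new - beta)
        beta, t = beta_new, t_new
    return beta
```

Swapping `soft_threshold` for `group_soft_threshold` (with a group structure) turns the same iteration into a sketch of the Group L1-L2 estimator; the Slope prox requires sorting the coefficients and is not shown.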
