Learning General Halfspaces with General Massart Noise under the Gaussian Distribution
Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis
Abstract
We study the problem of PAC learning halfspaces on R^d with Massart noise under the Gaussian distribution. In the Massart model, an adversary is allowed to flip the label of each point x with unknown probability η(x) ≤ η, for some parameter η ∈ [0, 1/2]. The goal is to find a hypothesis with misclassification error of OPT + ε, where OPT is the error of the target halfspace. This problem had been previously studied under two assumptions: (i) the target halfspace is homogeneous (i.e., the separating hyperplane goes through the origin), and (ii) the parameter η is strictly smaller than 1/2. Prior to this work, no nontrivial bounds were known when either of these assumptions is removed. We study the general problem and establish the following:

For η < 1/2, we give a learning algorithm for general halfspaces with sample and computational complexity d^{O_η(log(1/γ))} poly(1/ε), where γ = min{Pr[f(x) = 1], Pr[f(x) = -1]} is the bias of the target halfspace f. Prior efficient algorithms could only handle the special case of γ = 1/2. Interestingly, we establish a qualitatively matching lower bound of d^{Ω(log(1/γ))} on the complexity of any Statistical Query (SQ) algorithm.

For η = 1/2, we give a learning algorithm for general halfspaces with sample and computational complexity O_ε(1) d^{O(log(1/ε))}. This result is new even for the subclass of homogeneous halfspaces; prior algorithms for homogeneous Massart halfspaces provide vacuous guarantees for η = 1/2. We complement our upper bound with a nearly-matching SQ lower bound of d^{Ω(log(1/ε))}, which holds even for the special case of homogeneous halfspaces.
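To make the learning setup concrete, the following is a minimal sketch (not from the paper, and not its algorithm) of how one could simulate the Massart model it studies: examples drawn from a standard Gaussian, labeled by a general (non-homogeneous) halfspace, with each label flipped independently with some point-dependent probability capped at the noise bound η. The function name, the particular choice of η(x), and all parameters below are our own illustrative assumptions.

```python
import numpy as np

def sample_massart_halfspace(n, d, w, t, eta, rng):
    """Draw n examples x ~ N(0, I_d), label them with the halfspace
    f(x) = sign(<w, x> - t), then flip each label independently with
    probability eta(x) <= eta, as in the Massart noise model."""
    X = rng.standard_normal((n, d))
    clean = np.sign(X @ w - t)
    clean[clean == 0] = 1  # break the measure-zero tie toward +1
    # The adversary may pick any flip rate eta(x) bounded by eta.
    # Purely for illustration, points near the boundary are made noisier.
    margin = np.abs(X @ w - t)
    eta_x = eta * np.exp(-margin)  # satisfies eta(x) <= eta everywhere
    flips = rng.random(n) < eta_x
    y = np.where(flips, -clean, clean)
    return X, y, clean

rng = np.random.default_rng(0)
w = np.array([1.0, 0.0, 0.0])  # unit normal vector of the halfspace
t = 0.5                        # nonzero threshold: a biased, non-homogeneous halfspace
X, y, clean = sample_massart_halfspace(100_000, 3, w, t, 0.3, rng)
# The target f itself has observed error E[eta(x)] <= eta on this sample.
print(round(float(np.mean(y != clean)), 3))
```

The goal in the abstract is then to recover, from (X, y) alone, a hypothesis whose error is at most OPT + ε, where OPT = E[η(x)] is the error of the unknown target halfspace.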