Representation, Approximation and Learning of Submodular Functions Using Low-rank Decision Trees
Vitaly Feldman, Pravesh Kothari, Jan Vondrák
Abstract
We study the complexity of approximate representation and learning of submodular functions over the uniform distribution on the Boolean hypercube {0,1}^n. Our main result is the following structural theorem: any submodular function is ε-close in ℓ_2 to a real-valued decision tree (DT) of depth O(1/ε^2). This immediately implies that any submodular function is ε-close to a function of at most 2^{O(1/ε^2)} variables and has a spectral ℓ_1 norm of 2^{O(1/ε^2)}. It also implies the closest previous result, which states that submodular functions can be approximated by polynomials of degree O(1/ε^2) (Cheraghchi et al., 2012). Our result is proved by constructing an approximation of a submodular function by a DT of rank 4/ε^2, together with a proof that any rank-r DT can be ε-approximated by a DT of depth (5/2)(r + log(1/ε)). We show that these structural results can be exploited to give an attribute-efficient PAC learning algorithm for submodular functions running in time O(n^2) · 2^{O(1/ε^4)}. The best previous algorithm for the problem requires n^{O(1/ε^2)} time and examples (Cheraghchi et al., 2012), although it also works in the agnostic setting. In addition, we give improved learning algorithms for a number of related settings. We also prove that our PAC and agnostic learning algorithms are essentially optimal via two lower bounds: (1) an information-theoretic lower bound of 2^{Ω(1/ε^{2/3})} on the complexity of learning monotone submodular functions in any reasonable model; (2) a computational lower bound of n^{Ω(1/ε^{2/3})} based on a reduction to learning of sparse parities with noise, a problem widely believed to be intractable. These are the first lower bounds for learning of submodular functions over the uniform distribution.
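To make the notion of decision-tree rank used above concrete, here is a minimal Python sketch (our own illustration, not code from the paper) of the standard recursive rank definition together with the quoted depth bound (5/2)(r + log(1/ε)). The Node representation, the helper names, and the choice of base-2 logarithm are assumptions made for illustration.

```python
from dataclasses import dataclass
from math import log2
from typing import Optional


@dataclass
class Node:
    """Binary decision-tree node; a node with no children is a leaf."""
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def rank(t: Optional[Node]) -> int:
    """Standard decision-tree rank: a leaf has rank 0; an internal node
    with child ranks r1, r2 has rank max(r1, r2) if r1 != r2, else r1 + 1."""
    if t is None or (t.left is None and t.right is None):
        return 0
    r1, r2 = rank(t.left), rank(t.right)
    return max(r1, r2) if r1 != r2 else r1 + 1


def depth_bound(r: int, eps: float) -> float:
    """Depth sufficient to eps-approximate a rank-r DT, per the abstract:
    (5/2) * (r + log(1/eps)); the base-2 log is an assumption here."""
    return 2.5 * (r + log2(1 / eps))


def caterpillar(d: int) -> Node:
    """A path-like tree of depth d: every internal node has one leaf child."""
    t = Node(Node(), Node())
    for _ in range(d - 1):
        t = Node(Node(), t)
    return t


def complete(d: int) -> Node:
    """A complete binary tree of depth d."""
    return Node() if d == 0 else Node(complete(d - 1), complete(d - 1))


print(rank(caterpillar(50)))   # 1  -- deep but low-rank
print(rank(complete(6)))       # 6  -- rank equals depth for complete trees
print(depth_bound(400, 0.1))   # depth bound when r = 4/eps^2 with eps = 0.1
```

Under this definition rank never exceeds depth, while low-rank trees can be arbitrarily deep; combining the rank-4/ε^2 approximation with the depth-reduction bound above is what yields the O(1/ε^2)-depth statement of the main theorem.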