
Generic Coreset for Scalable Learning of Monotonic Kernels: Logistic Regression, Sigmoid and more

2018-02-21

Elad Tolochinsky, Ibrahim Jubran, Dan Feldman


Abstract

A coreset (or core-set) is a small weighted subset Q of an input set P, with respect to a given monotonic function f: ℝ → ℝ, that provably approximates the fitting loss ∑_{p∈P} f(p·x) for any given x ∈ ℝ^d. Using Q, we can obtain an approximation of the minimizer x* of this loss by running existing optimization algorithms on Q. In this work we provide: (i) a lower bound which proves that there are sets with no coreset smaller than n = |P| for general monotonic loss functions; (ii) a proof that, under a natural assumption that holds e.g. for logistic regression and the sigmoid activation function, a small coreset exists for any input P; (iii) a generic coreset construction algorithm that computes such a small coreset Q in O(nd + n log n) time; and (iv) experimental results which demonstrate that our coresets are effective and are much smaller in practice than predicted in theory.
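The core idea above — a weighted subset whose weighted loss approximates the full loss ∑_{p∈P} f(p·x) — can be illustrated with a minimal sketch. This is not the paper's algorithm: the sampling probabilities below are a crude norm-based proxy for true sensitivities, and all names (`logistic_loss`, `sample_coreset`) are hypothetical, chosen for demonstration only.

```python
import numpy as np

def logistic_loss(P, x, w=None):
    """Weighted fitting loss sum_i w_i * f(p_i . x) with f(t) = log(1 + e^t)."""
    if w is None:
        w = np.ones(len(P))
    t = P @ x
    # np.logaddexp(0, t) is a numerically stable log(1 + e^t)
    return float(np.sum(w * np.logaddexp(0.0, t)))

def sample_coreset(P, m, rng):
    """Draw m points with probabilities mixing a uniform term and a
    norm-proportional term (a stand-in for real sensitivity bounds),
    and reweight each sampled point by 1 / (m * q_i) so the weighted
    loss is an unbiased estimate of the full loss."""
    n = len(P)
    norms = np.linalg.norm(P, axis=1)
    q = 0.5 / n + 0.5 * norms / norms.sum()
    idx = rng.choice(n, size=m, replace=True, p=q)
    weights = 1.0 / (m * q[idx])
    return P[idx], weights

rng = np.random.default_rng(0)
P = rng.normal(size=(5000, 3))      # synthetic input set, n = 5000, d = 3
x = np.array([0.5, -1.0, 0.25])     # an arbitrary query point

Q, w = sample_coreset(P, m=500, rng=rng)
full = logistic_loss(P, x)
approx = logistic_loss(Q, x, w)
rel_err = abs(full - approx) / full
```

In the paper, provable multiplicative guarantees for every x come from bounding the true sensitivity of each point; the sketch only shows the mechanics of sampling and reweighting that such constructions share.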
