On Approximability of $\ell_2^2$ Min-Sum Clustering
Karthik C. S., Euiwoong Lee, Yuval Rabani, Chris Schwiegelshohn, Samson Zhou
Abstract
The $\ell_2^2$ min-sum $k$-clustering problem is to partition an input set into clusters $C_1, \ldots, C_k$ to minimize $\sum_{i=1}^{k} \sum_{p,q \in C_i} \|p-q\|_2^2$. Although $\ell_2^2$ min-sum $k$-clustering is NP-hard, it is not known whether it is NP-hard to approximate $\ell_2^2$ min-sum $k$-clustering beyond a certain factor. In this paper, we give the first hardness-of-approximation result for the $\ell_2^2$ min-sum $k$-clustering problem. We show that it is NP-hard to approximate the objective to a factor better than 1.056 and, moreover, assuming a balanced variant of the Johnson Coverage Hypothesis, it is NP-hard to approximate the objective to a factor better than 1.327. We then complement our hardness result by giving a nearly linear-time parameterized PTAS for $\ell_2^2$ min-sum $k$-clustering running in time $O\big(n^{1+o(1)} d \cdot 2^{(k\varepsilon^{-1})^{O(1)}}\big)$, where $d$ is the underlying dimension of the input dataset. Finally, we consider a learning-augmented setting, where the algorithm has access to an oracle that outputs a label $i \in [k]$ for each input point, thereby implicitly partitioning the input dataset into $k$ clusters that induce an approximately optimal solution, up to some amount of adversarial error $\alpha \in [0, 1/2)$. We give a polynomial-time algorithm that outputs a $\frac{1+\gamma\alpha}{(1-\alpha)^2}$-approximation to $\ell_2^2$ min-sum $k$-clustering, for a fixed constant $\gamma > 0$.
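To make the objective concrete, here is a minimal sketch (in Python with NumPy; the paper specifies no implementation, so the function name, the ordered-pair convention for the inner sum, and the usage setup below are all assumptions) that evaluates the $\ell_2^2$ min-sum cost of a given labeling. It uses the standard centroid identity $\sum_{p,q \in C} \|p-q\|_2^2 = 2|C| \sum_{p \in C} \|p-\mu_C\|_2^2$, valid when the sum ranges over ordered pairs, to avoid an explicitly quadratic pairwise computation.

```python
import numpy as np

def min_sum_cost(points: np.ndarray, labels: np.ndarray, k: int) -> float:
    """l2^2 min-sum cost of the partition induced by `labels`:
    sum over clusters C_i of sum_{p,q in C_i} ||p - q||_2^2, with the
    inner sum over ordered pairs (an assumption; an unordered-pair
    convention would halve the value)."""
    cost = 0.0
    for i in range(k):
        cluster = points[labels == i]
        if cluster.shape[0] == 0:
            continue
        # Centroid identity: sum_{p,q in C} ||p-q||^2 = 2|C| sum_p ||p - mu||^2,
        # since the cross terms vanish around the centroid mu.
        mu = cluster.mean(axis=0)
        cost += 2.0 * cluster.shape[0] * float(np.sum((cluster - mu) ** 2))
    return cost

# Hypothetical usage, mimicking the learning-augmented setting: an oracle
# labeling in which roughly an alpha-fraction of points receives arbitrary
# (here: random) labels, whose cost can be compared to the planted partition.
rng = np.random.default_rng(0)
k, n, d, alpha = 3, 300, 5, 0.1
true_labels = rng.integers(0, k, size=n)
points = rng.normal(size=(n, d)) + 10.0 * true_labels[:, None]  # separated clusters
oracle_labels = true_labels.copy()
corrupted = rng.random(n) < alpha
oracle_labels[corrupted] = rng.integers(0, k, size=corrupted.sum())
print(min_sum_cost(points, true_labels, k), min_sum_cost(points, oracle_labels, k))
```

The same identity also makes the relationship to $k$-means transparent: the min-sum objective is exactly the $k$-means objective with each cluster's contribution weighted by $2|C_i|$.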