
Tree-Guided L_1-Convex Clustering

2025-03-31

Bingyuan Zhang, Yoshikazu Terada

Abstract

Convex clustering is a modern clustering framework that guarantees globally optimal solutions and performs comparably to other advanced clustering methods. However, obtaining a complete dendrogram (clusterpath) for large-scale datasets remains computationally challenging because iterative optimization approaches are costly. To address this limitation, we develop a novel convex clustering algorithm called Tree-Guided L_1-Convex Clustering (TGCC). We first observe that the loss function of L_1-convex clustering with tree-structured weights can be optimized efficiently by dynamic programming. We then develop an efficient cluster fusion algorithm that exploits the tree structure of the weights to accelerate the optimization and to eliminate the cluster splits commonly observed in convex clustering. By combining the dynamic programming approach with the cluster fusion algorithm, TGCC achieves superior computational efficiency without sacrificing clustering performance. Remarkably, TGCC can construct a complete clusterpath for 10^6 points in R^2 within 15 seconds on a standard laptop, without parallel or distributed computing frameworks. Moreover, we extend TGCC to biclustering and sparse convex clustering.
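To make the loss function mentioned in the abstract concrete, here is a minimal sketch that evaluates the commonly used L_1-convex clustering objective, 0.5 * sum_i ||x_i - u_i||_2^2 + lambda * sum_{(i,j) in E} w_ij * ||u_i - u_j||_1, where the edge set E is assumed to form a tree over the data points. The function name, argument layout, and toy data are illustrative assumptions, not taken from the paper or its code, and the sketch only evaluates the loss; it does not implement the dynamic programming or cluster fusion steps of TGCC.

```python
def l1_convex_clustering_loss(X, U, edges, weights, lam):
    """Evaluate the L_1-convex clustering objective (hypothetical helper).

    X, U    : lists of equal-length coordinate lists (data points, centroids)
    edges   : list of (i, j) index pairs, assumed here to form a tree
    weights : one non-negative weight per edge
    lam     : regularization strength (lambda)
    """
    # Data-fit term: half the squared Euclidean distance between x_i and u_i.
    fit = 0.5 * sum(
        (x - u) ** 2 for xi, ui in zip(X, U) for x, u in zip(xi, ui)
    )
    # Fusion penalty: weighted L_1 distance between centroids joined by an edge.
    penalty = sum(
        w * sum(abs(a - b) for a, b in zip(U[i], U[j]))
        for (i, j), w in zip(edges, weights)
    )
    return fit + lam * penalty


# Toy example: three points in R^2 joined by a chain, which is a tree.
X = [[0.0, 0.0], [1.0, 0.0], [4.0, 0.0]]
edges = [(0, 1), (1, 2)]
weights = [1.0, 1.0]

# No fusion: every centroid sits on its own point, so only the penalty is paid.
loss_separate = l1_convex_clustering_loss(X, X, edges, weights, lam=0.5)

# Full fusion: all centroids sit at the mean, so only the fit term is paid.
mean = [5.0 / 3.0, 0.0]
loss_fused = l1_convex_clustering_loss(X, [mean] * 3, edges, weights, lam=0.5)
```

Because the L_1 penalty is a sum over coordinates, this objective separates into one independent problem per coordinate, each defined on the same tree; that separability is what makes a tree-based dynamic program a natural fit.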
