An objective function for order preserving hierarchical clustering
Daniel Bakkelund
Unverified — Be the first to reproduce this paper.
ReproduceAbstract
We present a theory and an objective function for similarity-based hierarchical clustering of probabilistic partial orders and directed acyclic graphs (DAGs). Specifically, given elements x y in the partial order, and their respective clusters [x] and [y], the theory yields an order relation ' on the clusters such that [x]'[y]. The theory provides a concise definition of order-preserving hierarchical clustering, and offers a classification theorem identifying the order-preserving trees (dendrograms). To determine the optimal order-preserving trees, we develop an objective function that frames the problem as a bi-objective optimisation, aiming to satisfy both the order relation and the similarity measure. We prove that the optimal trees under the objective are both order-preserving and exhibit high-quality hierarchical clustering. Since finding an optimal solution is NP-hard, we introduce a polynomial-time approximation algorithm and demonstrate that the method outperforms existing methods for order-preserving hierarchical clustering by a significant margin.