Abstract
We present a theory and an objective function for similarity-basedhierarchical clustering of probabilistic partial orders and directed acyclicgraphs (DAGs). Specifically, given elements $x \le y$ in the partial order, andtheir respective clusters $[x]$ and $[y]$, the theory yields an order relation$\le'$ on the clusters such that $[x]\le'[y]$. The theory provides a concisedefinition of order-preserving hierarchical clustering, and offers aclassification theorem identifying the order-preserving trees (dendrograms). Todetermine the optimal order-preserving trees, we develop an objective functionthat frames the problem as a bi-objective optimisation, aiming to satisfy boththe order relation and the similarity measure. We prove that the optimal treesunder the objective are both order-preserving and exhibit high-qualityhierarchical clustering. Since finding an optimal solution is NP-hard, weintroduce a polynomial-time approximation algorithm and demonstrate that themethod outperforms existing methods for order-preserving hierarchicalclustering by a significant margin.