Clustering Optimisation Method for Highly Connected Biological Data

  • 2022-08-11 18:41:20
  • Richard Tjörnhammar

Abstract

Currently, data-driven discovery in biological sciences resides in finding segmentation strategies in multivariate data that produce sensible descriptions of the data. Clustering is but one of several approaches and sometimes falls short because of difficulties in assessing reasonable cutoffs or the number of clusters that need to be formed, or because an approach fails to preserve topological properties of the original system in its clustered form. In this work, we show how a simple metric for connectivity clustering evaluation leads to an optimised segmentation of biological data. The novelty of the work resides in the creation of a simple optimisation method for clustering crowded data. The resulting clustering approach relies only on metrics derived from the inherent properties of the clustering. The new method facilitates knowledge for optimised clustering and is easy to implement. We discuss how the clustering optimisation strategy corresponds to the viable information content yielded by the final segmentation. We further elaborate on how the clustering results, in the optimal solution, correspond to prior knowledge of three different data sets.

 
