ClustML: A Measure of Cluster Pattern Complexity in Scatterplots Learnt from Human-labeled Groupings

  • 2024-05-01 08:31:04
  • Mostafa M. Abbas, Ehsan Ullah, Abdelkader Baggag, Halima Bensmail, Michael Sedlmair, Michaël Aupetit
Visual quality measures (VQMs) are designed to support analysts by automatically detecting and quantifying patterns in visualizations. We propose a new VQM for visual grouping patterns in scatterplots, called ClustML, which is trained on previously collected human subject judgments. Our model encodes scatterplots in the parametric space of a Gaussian Mixture Model and uses a classifier trained on human judgment data to estimate the perceptual complexity of grouping patterns. The numbers of initial mixture components and final combined groups quantify the visual cluster patterns in a scatterplot. ClustML improves on existing VQMs, first, by better estimating human judgments on two-Gaussian cluster patterns and, second, by giving higher accuracy when ranking general cluster patterns in scatterplots. We use it to analyze kinship data for genome-wide association studies, in which experts rely on the visual analysis of large sets of scatterplots. We make the benchmark datasets and the new VQM available for practical use and further improvements.
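
As a rough illustration of the counting idea described above, the following minimal Python sketch takes a set of already-fitted mixture components and reports the number of initial components together with the number of groups left after merging. Everything in it is an assumption made for illustration: the `Component` class, the isotropic-Gaussian simplification, the `toy_merge_rule` distance threshold (a hypothetical stand-in for the classifier trained on human judgments), and the transitive merging via union-find are not taken from the ClustML implementation.

```python
from dataclasses import dataclass
from itertools import combinations
from math import hypot
from typing import Callable, List, Tuple


@dataclass
class Component:
    """One isotropic 2D Gaussian component (a simplification for this sketch)."""
    mean: Tuple[float, float]
    std: float
    weight: float


def toy_merge_rule(a: Component, b: Component) -> bool:
    """Hypothetical stand-in for a classifier trained on human judgments.

    Returns True when the two components are treated as ONE visual group.
    Assumption: merge when the centers are closer than twice the summed spreads.
    """
    dist = hypot(a.mean[0] - b.mean[0], a.mean[1] - b.mean[1])
    return dist < 2.0 * (a.std + b.std)


def count_groups(
    components: List[Component],
    same_group: Callable[[Component, Component], bool],
) -> Tuple[int, int]:
    """Return (number of initial components, number of merged groups)."""
    parent = list(range(len(components)))

    def find(i: int) -> int:
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair that the pairwise rule judges to be a single group.
    for i, j in combinations(range(len(components)), 2):
        if same_group(components[i], components[j]):
            parent[find(i)] = find(j)

    n_groups = len({find(i) for i in range(len(components))})
    return len(components), n_groups


# Illustrative data: two overlapping components and one far away.
components = [
    Component((0.0, 0.0), 0.5, 0.4),
    Component((1.0, 0.0), 0.5, 0.3),
    Component((10.0, 10.0), 0.5, 0.3),
]
print(count_groups(components, toy_merge_rule))  # (3, 2)
```

In this toy setting, the first two components are merged and the third stays separate, so three initial components reduce to two groups. In practice the pairwise rule would be replaced by a trained model, and the components would come from a mixture model fitted to the scatterplot.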


