Comparing Multi-class, Binary and Hierarchical Machine Learning Classification schemes for variable stars

  • 2019-07-18 17:59:02
  • Zafiirah Hosenie, Robert Lyon, Benjamin Stappers, Arrykrishna Mootoovaloo

Abstract

Upcoming synoptic surveys are set to generate an unprecedented amount of data. This requires an automatic framework that can quickly and efficiently provide classification labels for several new object classification challenges. Using data describing 11 types of variable stars from the Catalina Real-Time Transient Surveys (CRTS), we illustrate how to capture the most important information from computed features and describe detailed methods of how to robustly use Information Theory for feature selection and evaluation. We apply three Machine Learning (ML) algorithms and demonstrate how to optimize these classifiers via cross-validation techniques. For the CRTS dataset, we find that the Random Forest (RF) classifier performs best in terms of balanced-accuracy and geometric means. We demonstrate substantially improved classification results by converting the multi-class problem into a binary classification task, achieving a balanced-accuracy rate of $\sim$99 per cent for the classification of ${\delta}$-Scuti and Anomalous Cepheids (ACEP). Additionally, we describe how classification performance can be improved via converting a 'flat-multi-class' problem into a hierarchical taxonomy. We develop a new hierarchical structure and propose a new set of classification features, enabling the accurate identification of subtypes of cepheids, RR Lyrae and eclipsing binary stars in CRTS data.
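
The two metrics named above, balanced accuracy and the geometric mean, can be illustrated with a minimal sketch. It assumes the common definitions (balanced accuracy as the unweighted mean of per-class recalls, and the geometric mean as the geometric mean of those same recalls); the paper's exact definitions may differ. The function names, the class label "DSCT" and the toy labels are hypothetical and are not taken from the CRTS data.

```python
from math import prod


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    totals, hits = {}, {}
    for truth, guess in zip(y_true, y_pred):
        totals[truth] = totals.get(truth, 0) + 1
        hits[truth] = hits.get(truth, 0) + (1 if truth == guess else 0)
    return {label: hits[label] / totals[label] for label in totals}


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recalls (assumed definition)."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def geometric_mean_score(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed definition)."""
    recalls = per_class_recall(y_true, y_pred)
    return prod(recalls.values()) ** (1 / len(recalls))


# Invented, imbalanced toy labels for a binary task (not real CRTS data).
y_true = ["ACEP", "ACEP", "DSCT", "DSCT", "DSCT", "DSCT"]
y_pred = ["ACEP", "DSCT", "DSCT", "DSCT", "DSCT", "DSCT"]

print(balanced_accuracy(y_true, y_pred))
print(geometric_mean_score(y_true, y_pred))
```

Both scores weight every class equally regardless of how many objects it contains, which is why they are often preferred over plain accuracy when some variable-star classes are rare.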

 
