Structure-Property Maps with Kernel Principal Covariates Regression

  • 2020-05-21 15:58:36
  • Benjamin A. Helfrecht, Rose K. Cersonsky, Guillaume Fraux, Michele Ceriotti


Data analyses based on linear methods constitute the simplest, most robust, and transparent approaches to the automatic processing of large amounts of data for building supervised or unsupervised machine learning models. Principal covariates regression (PCovR) is an underappreciated method that interpolates between principal component analysis and linear regression, and can be used to conveniently reveal structure-property relations in terms of simple-to-interpret, low-dimensional maps. Here we provide a pedagogic overview of these data analysis schemes, including the use of the kernel trick to introduce an element of non-linearity, while maintaining most of the convenience and the simplicity of linear approaches. We then introduce a kernelized version of PCovR and a sparsified extension, and demonstrate the performance of this approach in revealing and predicting structure-property relations in chemistry and materials science, showing a variety of examples including elemental carbon, porous silicate frameworks, organic molecules, amino acid conformers, and molecular materials.
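
To make the interpolation between principal component analysis and linear regression concrete, the following is a minimal NumPy sketch of linear PCovR in one common sample-space formulation. It is an illustration under stated assumptions, not the authors' implementation: the function name `pcovr_projection`, the mixing parameter name `alpha`, and the choice to scale X and Y by their Frobenius norms are all assumptions made here. The sketch builds a modified Gram matrix that mixes the feature Gram matrix with the Gram matrix of the least-squares prediction of Y, then takes its leading eigenvectors as the low-dimensional map.

```python
import numpy as np

def pcovr_projection(X, Y, alpha=0.5, n_components=2):
    """Hypothetical helper: low-dimensional PCovR-style projection of the samples.

    alpha = 1 recovers the PCA scores of X (up to sign); alpha = 0 uses only
    the least-squares approximation of Y. Normalization choices are assumptions.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]

    # Center both blocks and scale each to unit Frobenius norm so that
    # alpha weights comparable quantities.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    X = X / np.linalg.norm(X)
    Y = Y / np.linalg.norm(Y)

    # Ordinary least-squares approximation of Y from X.
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    Y_hat = X @ W

    # Modified Gram matrix mixing the PCA-like and regression-like terms.
    K = alpha * (X @ X.T) + (1.0 - alpha) * (Y_hat @ Y_hat.T)

    # Leading eigenpairs of the symmetric matrix K.
    evals, evecs = np.linalg.eigh(K)
    order = np.argsort(evals)[::-1][:n_components]
    evals = np.clip(evals[order], 0.0, None)  # guard against tiny negatives

    # Projection: eigenvectors scaled by the square roots of the eigenvalues.
    return evecs[:, order] * np.sqrt(evals)
```

Calling `pcovr_projection(X, Y, alpha=1.0)` should reproduce the first two PCA scores of the centered, scaled X up to a sign per column, while lowering `alpha` pulls the map toward directions that are predictive of Y. A kernelized variant would replace `X @ X.T` with a centered kernel matrix and the linear fit with kernel ridge regression; that step is not shown here.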

