Principal component-guided sparse regression

Abstract

We propose a new method for supervised learning, especially suited to wide data where the number of features is much greater than the number of observations. The method combines the lasso ($\ell_1$) sparsity penalty with a quadratic penalty that shrinks the coefficient vector toward the leading principal components of the feature matrix. We call the proposed method the "Lariat". The method can be especially powerful if the features are pre-assigned to groups (such as cell-pathways, assays or protein interaction networks). In that case, the Lariat shrinks each group-wise component of the solution toward the leading principal components of that group. In the process, it also carries out selection of the feature groups. We provide some theory for this method and illustrate it on a number of simulated and real data examples.
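
To make the combination of penalties concrete, one way such an objective could be written is sketched below. The notation is an assumption, since the abstract defines none: $X$ is the $n \times p$ feature matrix with singular value decomposition $X = U D V^T$, $d_1 \ge d_2 \ge \dots \ge d_m$ are its singular values, $y$ is the response, and $\lambda, \theta \ge 0$ are tuning parameters.

$$
\min_{\beta} \; \frac{1}{2}\,\lVert y - X\beta \rVert_2^2 \;+\; \lambda\,\lVert \beta \rVert_1 \;+\; \frac{\theta}{2}\,\beta^T V\,\mathrm{diag}\!\left(d_1^2 - d_j^2\right)_{j=1}^{m} V^T \beta
$$

In this sketch the quadratic term gives zero weight to the first principal direction ($j = 1$) and increasing weight to directions with smaller singular values, so the solution is pulled toward the leading principal components, while the $\ell_1$ term produces sparsity. For pre-assigned groups, an analogous form would apply the quadratic term to each group separately, using the decomposition $X_k = U_k D_k V_k^T$ of the columns in group $k$ (again a sketch under the same assumed notation):

$$
\min_{\beta} \; \frac{1}{2}\,\Bigl\lVert y - \sum_{k} X_k \beta_k \Bigr\rVert_2^2 \;+\; \lambda\,\lVert \beta \rVert_1 \;+\; \frac{\theta}{2} \sum_{k} \beta_k^T V_k\,\mathrm{diag}\!\left(d_{k1}^2 - d_{kj}^2\right)_{j} V_k^T \beta_k
$$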
