ET-Lasso: Efficient Tuning of Lasso for High-Dimensional Data

Abstract

L1 regularization (Lasso) has proven to be a versatile tool for selecting relevant features and estimating the model coefficients simultaneously. Despite its popularity, it is very challenging to guarantee the feature selection consistency of Lasso. One way to improve feature selection consistency is to select an ideal tuning parameter. Traditional tuning criteria, such as cross-validation and BIC, mainly focus on minimizing the estimated prediction error or maximizing the posterior model probability; they may either be time-consuming or fail to control the false discovery rate (FDR) when the number of features is extremely large. The other way is to introduce pseudo-features to learn the importance of the original ones. The Knockoff filter was recently proposed to control the FDR when performing feature selection. However, its performance is sensitive to the choice of the expected FDR threshold. Motivated by these ideas, we propose a new method that uses pseudo-features to obtain an ideal tuning parameter. In particular, we present the Efficient Tuning of Lasso (ET-Lasso), which separates active and inactive features by adding permuted features as pseudo-features in linear models. The pseudo-features are inactive by construction, so they can be used to obtain a cutoff for selecting the tuning parameter that separates active and inactive features. Experimental studies on both simulations and real-world data applications show that ET-Lasso can effectively and efficiently select active features under a wide range of scenarios.
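
The idea of using permuted pseudo-features to pick the tuning parameter can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not state the exact cutoff rule, so the sketch assumes one plausible rule, namely walking down the Lasso path and stopping at the last tuning parameter before any pseudo-feature receives a nonzero coefficient. The function names (`permutation_tuned_lasso`, `lasso_cd`), the lambda grid, and the plain coordinate-descent solver are all hypothetical choices made for illustration.

```python
import numpy as np


def soft_threshold(z, t):
    # Scalar soft-thresholding operator used by the Lasso update.
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_cd(G, c, lam, beta, n_sweeps=200, tol=1e-8):
    """Coordinate descent for (1/2n)||y - Z b||^2 + lam * ||b||_1,
    written in terms of G = Z'Z/n and c = Z'y/n.
    `beta` is updated in place, which gives a warm start along the path."""
    p = len(c)
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            # Correlation of feature j with the residual that excludes j.
            rho = c[j] - G[j] @ beta + G[j, j] * beta[j]
            new = soft_threshold(rho, lam) / G[j, j]
            max_change = max(max_change, abs(new - beta[j]))
            beta[j] = new
        if max_change < tol:
            break
    return beta


def permutation_tuned_lasso(X, y, n_lambdas=50, seed=0):
    """Assumed rule: keep the original features that are active at the last
    lambda on the path before any permuted pseudo-feature becomes active."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Pseudo-features: permuting the rows of X breaks any link to y,
    # so these columns are inactive by construction.
    X_perm = X[rng.permutation(n)]
    Z = np.hstack([X, X_perm])
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    yc = y - y.mean()
    G = Z.T @ Z / n
    c = Z.T @ yc / n
    lam_max = np.abs(c).max()  # smallest lambda with an all-zero solution
    lambdas = lam_max * np.logspace(0, -2, n_lambdas)
    beta = np.zeros(2 * p)
    selected = np.array([], dtype=int)
    chosen_lam = lambdas[0]
    for lam in lambdas:
        beta = lasso_cd(G, c, lam, beta)
        if np.any(beta[p:] != 0):
            break  # a pseudo-feature entered: keep the previous selection
        selected = np.flatnonzero(beta[:p])
        chosen_lam = lam
    return selected, chosen_lam


# Toy example: 10 features, of which only the first three are active.
rng = np.random.default_rng(1)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = 3.0
y = X @ beta_true + rng.standard_normal(n)

selected, lam = permutation_tuned_lasso(X, y)
print("selected original features:", selected)
print("chosen tuning parameter:", lam)
```

In this sketch a single Lasso path on the augmented design replaces a cross-validation loop, which is the source of the efficiency the abstract refers to. Inactive original features can still enter before the first pseudo-feature, so the assumed rule does not by itself guarantee exact recovery.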
