Binarsity: a penalization for one-hot encoded features in linear supervised learning

Abstract

This paper deals with the problem of large-scale linear supervised learning in settings where a large number of continuous features are available. We propose to combine the well-known trick of one-hot encoding of continuous features with a new penalization called \emph{binarsity}. In each group of binary features coming from the one-hot encoding of a single raw continuous feature, this penalization uses total-variation regularization together with an extra linear constraint. This induces two interesting properties on the model weights of the one-hot encoded features: they are piecewise constant, and are eventually block sparse. We establish non-asymptotic oracle inequalities for generalized linear models. Moreover, under a sparse additive model assumption, we prove that our procedure matches the state-of-the-art in this setting. Numerical experiments illustrate the good performance of our approach on several datasets. It is also noteworthy that our method has a numerical complexity comparable to standard $\ell_1$ penalization.
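
To make the construction concrete, here is a minimal sketch of the two ingredients named above: one-hot encoding of a continuous feature, and a per-block penalty built from total variation plus a linear constraint. Several choices are assumptions made for illustration and are not specified in the abstract: the quantile-based binning, the unweighted total variation, and the sum-to-zero form of the linear constraint. The function names are hypothetical.

```python
import numpy as np

def one_hot_quantile_bins(x, n_bins):
    """One-hot encode a 1-D continuous feature into n_bins binary columns."""
    # Assumption: interior cut points are evenly spaced empirical quantiles.
    cuts = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    # Bin index of each sample, in the range [0, n_bins - 1].
    idx = np.searchsorted(cuts, x, side="right")
    return np.eye(n_bins)[idx]

def block_penalty(theta_block, tol=1e-12):
    """Total variation of one block of weights, with a linear constraint.

    Assumption: the linear constraint is that the block sums to zero;
    a block violating it receives an infinite penalty.
    """
    if abs(theta_block.sum()) > tol:
        return np.inf
    # Sum of absolute differences between consecutive weights (unweighted).
    return np.abs(np.diff(theta_block)).sum()

def penalty(theta, block_sizes):
    """Sum of block penalties, one block per raw continuous feature."""
    blocks = np.split(theta, np.cumsum(block_sizes)[:-1])
    return sum(block_penalty(b) for b in blocks)
```

In this sketch, a block whose weights are piecewise constant has a small penalty, since only the jumps between consecutive bins are counted. A block that is constant and sums to zero is identically zero, which is how the combination of the two terms can switch off a raw feature entirely.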
