Continuation of Nesterov's Smoothing for Regression with Structured Sparsity in High-Dimensional Neuroimaging

Abstract

Predictive models can be used on high-dimensional brain images for diagnosisof a clinical condition. Spatial regularization through structured sparsityoffers new perspectives in this context and reduces the risk of overfitting themodel while providing interpretable neuroimaging signatures by forcing thesolution to adhere to domain-specific constraints. Total Variation (TV)enforces spatial smoothness of the solution while segmenting predictive regionsfrom the background. We consider the problem of minimizing the sum of a smoothconvex loss, a non-smooth convex penalty (whose proximal operator is known) anda wide range of possible complex, non-smooth convex structured penalties suchas TV or overlapping group Lasso. Existing solvers are either limited in thefunctions they can minimize or in their practical capacity to scale tohigh-dimensional imaging data. Nesterov's smoothing technique can be used tominimize a large number of non-smooth convex structured penalties butreasonable precision requires a small smoothing parameter, which slows down theconvergence speed. To benefit from the versatility of Nesterov's smoothingtechnique, we propose a first order continuation algorithm, CONESTA, whichautomatically generates a sequence of decreasing smoothing parameters. Thegenerated sequence maintains the optimal convergence speed towards any globallydesired precision. Our main contributions are: To propose an expression of theduality gap to probe the current distance to the global optimum in order toadapt the smoothing parameter and the convergence speed. We provide aconvergence rate, which is an improvement over classical proximal gradientsmoothing methods. We demonstrate on both simulated and high-dimensionalstructural neuroimaging data that CONESTA significantly outperforms manystate-of-the-art solvers in regard to convergence speed and precision.

Quick Read (beta)

loading the full paper ...