Variance Reduction via Primal-Dual Accelerated Dual Averaging for Nonsmooth Convex Finite-Sums

  • 2021-04-07 16:37:05
  • Chaobing Song, Stephen J. Wright, Jelena Diakonikolas
  • 0

Abstract

We study structured nonsmooth convex finite-sum optimization that appearswidely in machine learning applications, including support vector machines andleast absolute deviation. For the primal-dual formulation of this problem, wepropose a novel algorithm called \emph{Variance Reduction via Primal-DualAccelerated Dual Averaging (\vrpda)}. In the nonsmooth and general convexsetting, \vrpda~has the overall complexity $O(nd\log\min \{1/\epsilon, n\} +d/\epsilon )$ in terms of the primal-dual gap, where $n$ denotes the number ofsamples, $d$ the dimension of the primal variables, and $\epsilon$ the desiredaccuracy. In the nonsmooth and strongly convex setting, the overall complexityof \vrpda~becomes $O(nd\log\min\{1/\epsilon, n\} + d/\sqrt{\epsilon})$ in termsof both the primal-dual gap and the distance between iterate and optimalsolution. Both these results for \vrpda~improve significantly onstate-of-the-art complexity estimates, which are $O(nd\log \min\{1/\epsilon,n\} + \sqrt{n}d/\epsilon)$ for the nonsmooth and general convex setting and$O(nd\log \min\{1/\epsilon, n\} + \sqrt{n}d/\sqrt{\epsilon})$ for the nonsmoothand strongly convex setting, in a much more simple and straightforward way.Moreover, both complexities are better than \emph{lower} bounds for generalconvex finite sums that lack the particular (common) structure that weconsider. Our theoretical results are supported by numerical experiments, whichconfirm the competitive performance of \vrpda~compared to state-of-the-art.

 

Quick Read (beta)

loading the full paper ...