Projection-Free Online Optimization with Stochastic Gradient: From Convexity to Submodularity

Abstract

Online optimization has been a successful framework for solving large-scaleproblems under computational constraints and partial information. Currentmethods for online convex optimization require either a projection or exactgradient computation at each step, both of which can be prohibitively expensivefor large-scale applications. At the same time, there is a growing trend ofnon-convex optimization in machine learning community and a need for onlinemethods. Continuous submodular functions, which exhibit a natural diminishingreturns condition, have recently been proposed as a broad class of non-convexfunctions which may be efficiently optimized. Although online methods have beenintroduced, they suffer from similar problems. In this work, we proposeMeta-Frank-Wolfe, the first online projectionfree algorithm that usesstochastic gradient estimates. The algorithm relies on a careful sampling ofgradients in each round and achieves the optimal $O(\sqrt{T})$ adversarialregret bounds for convex and continuous submodular optimization. We alsopropose One-Shot Frank-Wolfe, a simpler algorithm which requires only a singlestochastic gradient estimate in each round and achieves a $O(T^{2/3})$stochastic regret bound for convex and continuous submodular optimization. Weapply our methods to develop a novel "lifting" framework for the onlinediscrete submodular maximization and also see that they outperform currentstate of the art techniques on an extensive set of experiments.

Quick Read (beta)

loading the full paper ...