A Random Walk Approach to First-Order Stochastic Convex Optimization

Abstract

Online minimization of an unknown convex function over a convex and compactset is considered under first-order stochastic bandit feedback, which returns arandom realization of the gradient of the function at each query point. Withoutknowing the distribution of the random gradients, a learning algorithmsequentially chooses query points with the objective of minimizing regretdefined as the expected cumulative loss of the function values at the querypoints in excess to the minimum value of the function. An active searchstrategy based on devising a biased random walk on an infinite-depth treeconstructed through successive partitioning of the domain of the function isdeveloped. It is shown that the biased random walk moves toward the optimalpoint in a geometric rate, leading to an order-optimal regret performance of$\Tilde{\O}(\sqrt T)$. The structural properties of this random-walk basedstrategy admits detailed finite-time regret analysis. By localizing dataprocessing to small subsets of the input domain based on the tree structure, itenjoys $\O(1)$ computation and memory complexity per query and allows dynamicallocation of limited data storage.

Quick Read (beta)

loading the full paper ...