Abstract
In many learning applications, the parameters of a model are structurally constrained in a way that can be modeled as lying on a Riemannian manifold. Riemannian optimization, in which procedures constrain an iterative minimizing sequence to remain on the manifold, is used to train such models. At the same time, tame geometry has become a significant topological description of nonsmooth functions that appear in the landscapes of training neural networks and other important models with structural compositions of continuous nonlinear functions with nonsmooth maps. In this paper, we study the properties of such stratifiable functions on a manifold and the behavior of retracted stochastic gradient descent, with diminishing stepsizes, for minimizing such functions.