A Smoother Way to Train Structured Prediction Models

Abstract

We present a framework to train a structured prediction model by performing smoothing on the inference algorithm it builds upon. Smoothing overcomes the non-smoothness inherent to the maximum margin structured prediction objective, and paves the way for the use of fast primal gradient-based optimization algorithms. We illustrate the proposed framework by developing a novel primal incremental optimization algorithm for the structural support vector machine. The proposed algorithm blends an extrapolation scheme for acceleration and an adaptive smoothing scheme and builds upon the stochastic variance-reduced gradient algorithm. We establish its worst-case global complexity bound and study several practical variants, including extensions to deep structured prediction. We present experimental results on two real-world problems, namely named entity recognition and visual object localization. The experimental results show that the proposed framework allows us to build upon efficient inference algorithms to develop large-scale optimization algorithms for structured prediction which can achieve competitive performance on the two real-world problems.
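To make the idea of smoothing concrete, here is a minimal sketch of one standard way to smooth a maximum over candidate scores: the log-sum-exp (entropy) surrogate with a smoothing parameter `mu`. The abstract does not specify which smoothing the framework uses, so the choice of log-sum-exp, the function names, and the toy scores are all assumptions made for illustration. In a structured model the list of scores would range over candidate outputs and would be computed by an inference algorithm rather than enumerated.

```python
import math


def smooth_max(scores, mu):
    """Log-sum-exp surrogate for max(scores) (illustrative assumption).

    Computes mu * log(sum(exp(s / mu))), shifted by the true max for
    numerical stability. It is differentiable everywhere and lies in
    [max(scores), max(scores) + mu * log(len(scores))].
    """
    m = max(scores)
    return m + mu * math.log(sum(math.exp((s - m) / mu) for s in scores))


def smooth_max_grad(scores, mu):
    """Gradient of smooth_max with respect to the scores.

    This is a softmax distribution over the candidates, whereas the
    plain max only has a subgradient at ties.
    """
    m = max(scores)
    weights = [math.exp((s - m) / mu) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical scores of three candidate outputs; the last two tie,
# which is exactly where the plain max is non-differentiable.
scores = [1.0, 2.0, 2.0]
value = smooth_max(scores, mu=0.1)
grad = smooth_max_grad(scores, mu=0.1)
```

A smaller `mu` brings the surrogate closer to the true max at the cost of a less smooth function, which is the trade-off that an adaptive smoothing scheme would manage.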
