Conditional Image Generation for Learning the Structure of Visual Objects

  • 2018-06-20 16:17:00
  • Tomas Jakab, Ankush Gupta, Hakan Bilen, Andrea Vedaldi
  • 10

Abstract

In this paper, we consider the problem of learning landmarks for object categories without any manual annotations. We cast this as the problem of conditionally generating an image of an object from another one, where the images differ by acquisition time and/or viewpoint. The process is aided by providing the generator with a keypoint-like representation extracted from the target image through a tight bottleneck. This encourages the representation to distil information about the object geometry, which changes from source to target, while the appearance, which is shared between the source and target, is read off from the source alone. Conditioning simplifies the generation task significantly, to the point that adopting a simple perceptual loss instead of more sophisticated approaches such as adversarial training is sufficient to learn landmarks. We show that our method is applicable to a large variety of datasets - faces, people, 3D objects, and digits - without any modifications. We further demonstrate that we can learn landmarks from synthetic image deformations or videos, all without manual supervision, while outperforming state-of-the-art unsupervised landmark detectors.
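The abstract does not say how the "tight bottleneck" is built. As a rough illustration only, one common way to obtain a keypoint-like representation is to turn each score map into a single 2D location with a spatial softmax and then re-render that location as a Gaussian blob, so that only K coordinate pairs pass through. The sketch below assumes that construction; the function name `keypoint_bottleneck`, the `sigma` parameter, and the [-1, 1] coordinate convention are hypothetical choices, not taken from the paper.

```python
import numpy as np

def keypoint_bottleneck(score_maps, sigma=0.1):
    """Hypothetical keypoint bottleneck (an assumption, not the paper's code).

    score_maps: array of shape (K, H, W) with raw per-keypoint scores.
    Returns (coords, heatmaps): coords is (K, 2) as (x, y) in [-1, 1],
    heatmaps is (K, H, W) with one Gaussian blob per keypoint.
    """
    K, H, W = score_maps.shape

    # Spatial softmax: each map becomes a probability distribution over pixels.
    flat = score_maps.reshape(K, -1)
    flat = flat - flat.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(flat)
    probs /= probs.sum(axis=1, keepdims=True)
    probs = probs.reshape(K, H, W)

    # Expected location under each distribution (normalised coordinates).
    ys = np.linspace(-1.0, 1.0, H)
    xs = np.linspace(-1.0, 1.0, W)
    mu_y = (probs.sum(axis=2) * ys).sum(axis=1)  # marginal over rows
    mu_x = (probs.sum(axis=1) * xs).sum(axis=1)  # marginal over columns
    coords = np.stack([mu_x, mu_y], axis=1)

    # Re-render each location as a Gaussian blob; appearance detail is discarded.
    dy = ys[None, :, None] - mu_y[:, None, None]
    dx = xs[None, None, :] - mu_x[:, None, None]
    heatmaps = np.exp(-(dx ** 2 + dy ** 2) / (2.0 * sigma ** 2))
    return coords, heatmaps

# Usage: two score maps, each with one strong peak.
scores = np.zeros((2, 9, 9))
scores[0, 2, 6] = 50.0  # row 2, column 6
scores[1, 4, 4] = 50.0  # image centre
coords, heatmaps = keypoint_bottleneck(scores)
```

In a setup like this, a generator would receive the heatmaps computed from the target image together with features of the source image, so geometry can only enter through the K locations while appearance has to come from the source.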

 
