Shapes and Context: In-the-Wild Image Synthesis & Manipulation

Abstract

We introduce a data-driven approach for interactively synthesizing in-the-wild images from semantic label maps. Our approach is dramatically different from recent work in this space, in that we make use of no learning. Instead, our approach uses simple but classic tools for matching scene context, shapes, and parts to a stored library of exemplars. Though simple, this approach has several notable advantages over recent work: (1) because nothing is learned, it is not limited to specific training data distributions (such as cityscapes, facades, or faces); (2) it can synthesize arbitrarily high-resolution images, limited only by the resolution of the exemplar library; (3) by appropriately composing shapes and parts, it can generate an exponentially large set of viable candidate output images (that can, say, be interactively searched by a user). We present results on the diverse COCO dataset, significantly outperforming learning-based approaches on standard image synthesis metrics. Finally, we explore user interaction and user controllability, demonstrating that our system can be used as a platform for user-driven content creation.
