Few-shot Semantic Image Synthesis Using StyleGAN Prior

Abstract

This paper tackles a challenging problem of generating photorealistic images from semantic layouts in few-shot scenarios where annotated training pairs are hardly available but pixel-wise annotation is quite costly. We present a training strategy that performs pseudo labeling of semantic masks using the StyleGAN prior. Our key idea is to construct a simple mapping between the StyleGAN feature and each semantic class from a few examples of semantic masks. With such mappings, we can generate an unlimited number of pseudo semantic masks from random noise to train an encoder for controlling a pre-trained StyleGAN generator. Although the pseudo semantic masks might be too coarse for previous approaches that require pixel-aligned masks, our framework can synthesize high-quality images from not only dense semantic masks but also sparse inputs such as landmarks and scribbles. Qualitative and quantitative results with various datasets demonstrate improvement over previous approaches with respect to layout fidelity and visual quality in as few as one- or five-shot settings.
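
The "simple mapping" between features and semantic classes can be illustrated with a small sketch. This is one plausible reading of the idea, not the paper's implementation: it assumes each class is summarized by the mean of its annotated feature vectors (a prototype) and that new pixels are labeled by cosine similarity to the nearest prototype. The helper `toy_features` is hypothetical and stands in for per-pixel features taken from a pre-trained StyleGAN generator; all function names and parameters are illustrative.

```python
import math
import random


def class_prototypes(features, labels):
    """Average the feature vectors of each class: {class_id: prototype}."""
    sums, counts = {}, {}
    for vec, cls in zip(features, labels):
        acc = sums.setdefault(cls, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[cls] = counts.get(cls, 0) + 1
    return {cls: [v / counts[cls] for v in acc] for cls, acc in sums.items()}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def pseudo_label(features, prototypes):
    """Assign each pixel feature the class of its most similar prototype."""
    return [
        max(prototypes, key=lambda cls: cosine(vec, prototypes[cls]))
        for vec in features
    ]


def toy_features(rng, labels, dim=8, noise=0.1):
    """Hypothetical stand-in for generator features (assumption, not StyleGAN):
    pixels of class c cluster around the c-th unit axis."""
    return [
        [(1.0 if i == cls else 0.0) + rng.gauss(0.0, noise) for i in range(dim)]
        for cls in labels
    ]


rng = random.Random(0)

# "Few-shot" step: a handful of annotated pixels per class define the mapping.
shot_labels = [0] * 20 + [1] * 20 + [2] * 20
shot_features = toy_features(rng, shot_labels)
prototypes = class_prototypes(shot_features, shot_labels)

# Pseudo-labeling step: features of a new random sample receive a mask for free.
new_truth = [rng.randrange(3) for _ in range(200)]
new_features = toy_features(rng, new_truth)
pseudo_mask = pseudo_label(new_features, prototypes)

accuracy = sum(p == t for p, t in zip(pseudo_mask, new_truth)) / len(new_truth)
print(f"pseudo-label agreement with toy ground truth: {accuracy:.2f}")
```

In the setting the abstract describes, the labeled features would come from the few annotated examples, and the pseudo-labeling step would be repeated on images sampled from random noise to produce (pseudo mask, latent) pairs for training the encoder. Those details are not shown here.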
