HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing

Abstract

The inversion of real images into StyleGAN's latent space is a well-studied problem. Nevertheless, applying existing approaches to real-world scenarios remains an open challenge, due to an inherent trade-off between reconstruction and editability: latent space regions which can accurately represent real images typically suffer from degraded semantic control. Recent work proposes to mitigate this trade-off by fine-tuning the generator to add the target image to well-behaved, editable regions of the latent space. While promising, this fine-tuning scheme is impractical for prevalent use as it requires a lengthy training phase for each new image. In this work, we introduce this approach into the realm of encoder-based inversion. We propose HyperStyle, a hypernetwork that learns to modulate StyleGAN's weights to faithfully express a given image in editable regions of the latent space. A naive modulation approach would require training a hypernetwork with over three billion parameters. Through careful network design, we reduce this to be in line with existing encoders. HyperStyle yields reconstructions comparable to those of optimization techniques with the near real-time inference capabilities of encoders. Lastly, we demonstrate HyperStyle's effectiveness on several applications beyond the inversion task, including the editing of out-of-domain images which were never seen during training.
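
To make the cost of naive weight modulation concrete, the following is a minimal sketch of what modulating a single convolutional layer could look like. The abstract does not state the modulation rule, so the multiplicative form `w * (1 + offset)`, the choice of one offset per (output, input) channel pair, the layer shape, and the function names `modulate_weights` and `offset_counts` are all assumptions made for illustration. The sketch does not reproduce the three-billion-parameter figure or the network design described in the paper.

```python
import numpy as np


def modulate_weights(weights, offsets):
    """Scale conv weights by (1 + offset), sharing each offset across the kernel.

    ASSUMPTION: this multiplicative, per-channel-pair rule is illustrative only;
    the abstract does not specify how the weights are modulated.
    weights: (out_ch, in_ch, k, k), offsets: (out_ch, in_ch)
    """
    return weights * (1.0 + offsets[:, :, None, None])


def offset_counts(out_ch, in_ch, k):
    """Return how many values a hypernetwork would have to predict for one layer:
    (one offset per weight, one offset per channel pair)."""
    per_weight = out_ch * in_ch * k * k
    per_channel_pair = out_ch * in_ch
    return per_weight, per_channel_pair


# Hypothetical 3x3 layer with 512 input and 512 output channels.
rng = np.random.default_rng(0)
w = rng.standard_normal((512, 512, 3, 3)).astype(np.float32)

# Zero offsets leave the generator weights unchanged.
zero = np.zeros((512, 512), dtype=np.float32)
assert np.array_equal(modulate_weights(w, zero), w)

print(offset_counts(512, 512, 3))  # (2359296, 262144)
```

Under these assumptions, predicting an offset for every weight of this one layer means about 2.4 million outputs, while sharing offsets across the kernel cuts that by a factor of nine. This kind of per-layer arithmetic suggests why an unconstrained hypernetwork grows so large.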
