Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold

Abstract

Synthesizing visual content that meets users' needs often requires flexible and precise controllability of the pose, shape, expression, and layout of the generated objects. Existing approaches gain controllability of generative adversarial networks (GANs) via manually annotated training data or a prior 3D model, which often lack flexibility, precision, and generality. In this work, we study a powerful yet much less explored way of controlling GANs, that is, to "drag" any points of the image to precisely reach target points in a user-interactive manner, as shown in Fig. 1. To achieve this, we propose DragGAN, which consists of two main components: 1) a feature-based motion supervision that drives the handle point to move towards the target position, and 2) a new point tracking approach that leverages the discriminative generator features to keep localizing the position of the handle points. Through DragGAN, anyone can deform an image with precise control over where pixels go, thus manipulating the pose, shape, expression, and layout of diverse categories such as animals, cars, humans, landscapes, etc. As these manipulations are performed on the learned generative image manifold of a GAN, they tend to produce realistic outputs even for challenging scenarios such as hallucinating occluded content and deforming shapes that consistently follow the object's rigidity. Both qualitative and quantitative comparisons demonstrate the advantage of DragGAN over prior approaches in the tasks of image manipulation and point tracking. We also showcase the manipulation of real images through GAN inversion.
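
To make the second component more concrete, the following is a minimal sketch of feature-based point tracking, not the authors' implementation. It assumes that tracking can be done as a nearest-neighbor search: given a feature map and the feature vector originally sampled at the handle point, it looks for the best-matching location in a small patch around the previous position. The function name `track_point`, the `radius` parameter, the L1 distance, and the random array standing in for generator features are all illustrative assumptions.

```python
import numpy as np


def track_point(features, ref_feature, prev_point, radius=3):
    """Relocate a handle point by nearest-neighbor search in feature space.

    features:    (H, W, C) array standing in for a generator feature map
                 (hypothetical input; any dense feature map would do).
    ref_feature: (C,) feature vector recorded at the handle point initially.
    prev_point:  (row, col) of the handle point before the latest update.
    radius:      half-width of the square search patch (assumed parameter).
    """
    h, w, _ = features.shape
    r0, c0 = prev_point

    # Clip the search patch to the feature map boundaries.
    top, bottom = max(r0 - radius, 0), min(r0 + radius + 1, h)
    left, right = max(c0 - radius, 0), min(c0 + radius + 1, w)
    patch = features[top:bottom, left:right]  # (ph, pw, C)

    # L1 distance between every patch feature and the reference feature.
    dist = np.abs(patch - ref_feature).sum(axis=-1)  # (ph, pw)

    # Convert the flat argmin index back to patch coordinates, then to
    # coordinates in the full feature map.
    dr, dc = np.unravel_index(np.argmin(dist), dist.shape)
    return (top + int(dr), left + int(dc))


# Toy usage: the reference feature lives at (5, 6); searching from the
# nearby stale position (4, 5) should recover it.
rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 16, 8))
ref = feats[5, 6].copy()
print(track_point(feats, ref, prev_point=(4, 5)))
```

Restricting the search to a local patch keeps each tracking step cheap and reflects the assumption that the handle point moves only a short distance between updates.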
