EDICT: Exact Diffusion Inversion via Coupled Transformations

  • 2022-11-22 18:02:49
  • Bram Wallace, Akash Gokul, Nikhil Naik
  • 36


Finding an initial noise vector that produces an input image when fed into the diffusion process (known as inversion) is an important problem in denoising diffusion models (DDMs), with applications for real image editing. The state-of-the-art approach for real image editing with inversion uses denoising diffusion implicit models (DDIMs) to deterministically noise the image to the intermediate state along the path that the denoising would follow given the original conditioning. However, DDIM inversion for real images is unstable as it relies on local linearization assumptions, which result in the propagation of errors, leading to incorrect image reconstruction and loss of content. To alleviate these problems, we propose Exact Diffusion Inversion via Coupled Transformations (EDICT), an inversion method that draws inspiration from affine coupling layers. EDICT enables mathematically exact inversion of real and model-generated images by maintaining two coupled noise vectors which are used to invert each other in an alternating fashion. Using Stable Diffusion, a state-of-the-art latent diffusion model, we demonstrate that EDICT successfully reconstructs real images with high fidelity. On complex image datasets like MS-COCO, EDICT reconstruction significantly outperforms DDIM, improving the mean square error of reconstruction by a factor of two. Using noise vectors inverted from real images, EDICT enables a wide range of image edits, from local and global semantic edits to image stylization, while maintaining fidelity to the original image structure. EDICT requires no model training/finetuning, prompt tuning, or extra data and can be combined with any pretrained DDM. Code will be made available shortly.
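The idea of two coupled vectors that invert each other can be illustrated with a toy affine-coupling round trip. The sketch below is not the EDICT update rule or its released code. It is a minimal illustration, assuming a hypothetical stand-in `toy_eps` for the noise-prediction network and arbitrary coefficients `a` and `b` in place of a real noise schedule. It shows only why alternating updates of this form can be undone exactly (up to floating-point rounding), however nonlinear the function is.

```python
import math


def toy_eps(v):
    # Hypothetical stand-in for a noise-prediction network (assumption):
    # any deterministic nonlinear function works for this illustration.
    return [math.tanh(3.0 * t) + 0.1 * math.sin(t) for t in v]


def couple_forward(x, y, a, b, f=toy_eps):
    # Update x using y, then update y using the *new* x.
    x_new = [a * xi + b * fi for xi, fi in zip(x, f(y))]
    y_new = [a * yi + b * fi for yi, fi in zip(y, f(x_new))]
    return x_new, y_new


def couple_inverse(x_new, y_new, a, b, f=toy_eps):
    # Undo the two updates in reverse order. Each step only evaluates f on a
    # vector that is already known, so f itself never has to be inverted.
    y = [(yi - b * fi) / a for yi, fi in zip(y_new, f(x_new))]
    x = [(xi - b * fi) / a for xi, fi in zip(x_new, f(y))]
    return x, y


def round_trip_error(steps=50, a=0.98, b=0.05):
    # a and b are arbitrary illustrative coefficients, not a real schedule.
    x0 = [0.3, -1.2, 0.8]
    y0 = list(x0)  # both vectors start as copies of the same input
    x, y = x0, y0
    for _ in range(steps):
        x, y = couple_forward(x, y, a, b)
    for _ in range(steps):
        x, y = couple_inverse(x, y, a, b)
    return max(abs(p - q) for p, q in zip(x + y, x0 + y0))


if __name__ == "__main__":
    print(round_trip_error())
```

The inverse works because each half-step is affine in the vector being updated and depends on the other vector only through a value that is still available when the step is reversed. An approach that instead approximates the function's input by linearization has no such guarantee, which is the failure mode the abstract attributes to DDIM inversion.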


