Transparent Image Layer Diffusion using Latent Transparency

Abstract

We present LayerDiffusion, an approach enabling large-scale pretrained latent diffusion models to generate transparent images. The method allows generation of single transparent images or of multiple transparent layers. The method learns a "latent transparency" that encodes alpha channel transparency into the latent manifold of a pretrained latent diffusion model. It preserves the production-ready quality of the large diffusion model by regulating the added transparency as a latent offset with minimal changes to the original latent distribution of the pretrained model. In this way, any latent diffusion model can be converted into a transparent image generator by finetuning it with the adjusted latent space. We train the model with 1M transparent image layer pairs collected using a human-in-the-loop collection scheme. We show that latent transparency can be applied to different open source image generators, or be adapted to various conditional control systems to achieve applications like foreground/background-conditioned layer generation, joint layer generation, structural control of layer contents, etc. A user study finds that in most cases (97%) users prefer our natively generated transparent content over previous ad-hoc solutions such as generating and then matting. Users also report the quality of our generated transparent images is comparable to real commercial transparent assets like Adobe Stock.
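
The idea of adding transparency as a small latent offset can be illustrated with a toy sketch. This is not the paper's trained encoder: the function name `add_latent_offset`, the `max_ratio` bound, the norm-based rescaling, and the random arrays standing in for a latent and an encoded alpha channel are all assumptions made for illustration. The sketch only shows what it means for an offset to perturb a latent by a bounded relative amount.

```python
import numpy as np


def add_latent_offset(latent, offset, max_ratio=0.1):
    """Add an offset to a latent, rescaling the offset so that its norm
    is at most max_ratio times the norm of the latent.

    Hypothetical helper: the bound and the rescaling rule are illustrative
    assumptions, not the regularization used in the paper.
    """
    latent = np.asarray(latent, dtype=np.float64)
    offset = np.asarray(offset, dtype=np.float64)
    limit = max_ratio * np.linalg.norm(latent)
    norm = np.linalg.norm(offset)
    if norm > limit and norm > 0:
        # Shrink the offset so the adjusted latent stays close to the original.
        offset = offset * (limit / norm)
    return latent + offset


rng = np.random.default_rng(0)
latent = rng.standard_normal((4, 64, 64))  # stand-in for a pretrained model's latent
offset = rng.standard_normal((4, 64, 64))  # stand-in for an encoded alpha channel

adjusted = add_latent_offset(latent, offset)
relative_change = np.linalg.norm(adjusted - latent) / np.linalg.norm(latent)
print(f"relative change in latent norm: {relative_change:.3f}")
```

In this toy setting the adjusted latent keeps the original shape and differs from it by at most the chosen ratio, which mirrors the abstract's goal of changing the original latent distribution as little as possible.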
