Abstract
Latent Diffusion Models (LDMs) produce high-quality, photo-realistic images; however, the latency incurred by multiple costly inference iterations can restrict their applicability. We introduce LatentCRF, a continuous Conditional Random Field (CRF) model, implemented as a neural network layer, that models the spatial and semantic relationships among the latent vectors in the LDM. By replacing some of the computationally intensive LDM inference iterations with our lightweight LatentCRF, we achieve a superior balance between quality, speed, and diversity. We increase inference efficiency by 33% with no loss in image quality or diversity compared to the full LDM. LatentCRF is an easy add-on that does not require modifying the LDM.