Learning to Discretize Denoising Diffusion ODEs

Abstract

Diffusion Probabilistic Models (DPMs) are generative models showing competitive performance in various domains, including image synthesis and 3D point cloud generation. Sampling from pre-trained DPMs involves multiple neural function evaluations (NFEs) to transform Gaussian noise samples into images, resulting in higher computational costs compared to single-step generative models such as GANs or VAEs. Therefore, reducing the number of NFEs while preserving generation quality is crucial. To address this, we propose LD3, a lightweight framework designed to learn the optimal time discretization for sampling. LD3 can be combined with various samplers and consistently improves generation quality without having to retrain resource-intensive neural networks. We demonstrate analytically and empirically that LD3 improves sampling efficiency with much less computational overhead. We evaluate our method with extensive experiments on 7 pre-trained models, covering unconditional and conditional sampling in both pixel-space and latent-space DPMs. We achieve FIDs of 2.38 (10 NFE) and 2.27 (10 NFE) on unconditional CIFAR10 and AFHQv2 in 5-10 minutes of training. LD3 offers an efficient approach to sampling from pre-trained diffusion models. Code is available at https://github.com/vinhsuhi/LD3.
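
To make "time discretization" concrete, the following is a minimal illustrative sketch of how a sampling time grid can be expressed through a small set of unconstrained parameters that an optimizer could then adjust. It is an assumption for illustration only and is not taken from the LD3 code: the function name `timesteps_from_params`, the softmax-plus-cumulative-sum parameterization, and the endpoints `t_max` and `t_min` are all hypothetical. The optimization itself (which would require a pre-trained model and a solver) is omitted.

```python
import numpy as np

def timesteps_from_params(theta, t_max=1.0, t_min=1e-3):
    """Map unconstrained parameters to a strictly decreasing time grid.

    Hypothetical parameterization for illustration; not the LD3 implementation.
    """
    theta = np.asarray(theta, dtype=float)
    # Softmax turns the parameters into positive step fractions that sum to 1.
    w = np.exp(theta - theta.max())
    w = w / w.sum()
    # Cumulative fractions run from 0 (start of sampling) to 1 (end of sampling).
    frac = np.concatenate(([0.0], np.cumsum(w)))
    frac[-1] = 1.0  # guard against floating-point drift in the cumulative sum
    # Interpolate from t_max (pure noise) down to t_min (nearly clean sample).
    return t_max - (t_max - t_min) * frac

# Ten parameters give eleven time points, i.e. ten solver steps.
uniform = timesteps_from_params(np.zeros(10))                # equal-sized steps
skewed = timesteps_from_params(np.linspace(1.0, -1.0, 10))   # larger steps first
```

With all parameters equal, the grid reduces to uniform spacing; changing the parameters shifts where the steps are concentrated while keeping the endpoints and the ordering fixed. How many NFEs ten steps correspond to depends on the solver, since higher-order solvers may evaluate the network more than once per step.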
