Abstract
Diffusion models have emerged as the de facto choice for generating visual signals. However, training a single model to predict noise across various noise levels poses significant challenges, necessitating numerous iterations and incurring substantial computational costs. Various approaches, such as loss weighting strategy design and architectural refinements, have been introduced to expedite convergence. In this study, we propose a novel approach to designing the noise schedule to enhance the training of diffusion models. Our key insight is that importance sampling of the logarithm of the signal-to-noise ratio (logSNR), which is theoretically equivalent to a modified noise schedule, is particularly beneficial for training efficiency when it increases the sampling frequency around $\log \text{SNR}=0$. We empirically demonstrate the superiority of our noise schedule over the standard cosine schedule. Furthermore, we highlight the advantages of our noise schedule design on the ImageNet benchmark, showing that the designed schedule consistently benefits different prediction targets.
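
To make the key insight concrete, the following is a minimal sketch, not the authors' implementation. It compares the logSNR values implied by the standard cosine schedule (with uniformly sampled timesteps) against logSNR values drawn directly from a density concentrated at zero. The Laplace density, its scale `b=0.5`, the variance-preserving parameterization, and all function names are assumptions made for illustration only; the abstract does not specify them.

```python
import numpy as np


def cosine_logsnr(t):
    """logSNR of the cosine schedule with alpha = cos(pi*t/2), sigma = sin(pi*t/2)."""
    return -2.0 * np.log(np.tan(np.pi * t / 2.0))


def sample_logsnr_laplace(rng, n, mu=0.0, b=0.5):
    """Draw logSNR from Laplace(mu, b). The density and scale are assumed choices."""
    return rng.laplace(loc=mu, scale=b, size=n)


def logsnr_to_alpha_sigma(logsnr):
    """Variance-preserving coefficients: alpha^2 = sigmoid(logsnr), sigma^2 = sigmoid(-logsnr)."""
    alpha = np.sqrt(1.0 / (1.0 + np.exp(-logsnr)))
    sigma = np.sqrt(1.0 / (1.0 + np.exp(logsnr)))
    return alpha, sigma


rng = np.random.default_rng(0)
n = 100_000

# Cosine schedule: timesteps are uniform, so the logSNR density is fixed by the schedule.
t = rng.uniform(1e-4, 1.0 - 1e-4, size=n)
lam_cos = cosine_logsnr(t)

# Importance sampling: draw logSNR directly, placing more mass near logSNR = 0.
lam_lap = sample_logsnr_laplace(rng, n)

# Fraction of training samples that fall in the band |logSNR| < 1.
frac_cos = np.mean(np.abs(lam_cos) < 1.0)
frac_lap = np.mean(np.abs(lam_lap) < 1.0)
print(f"cosine:  {frac_cos:.3f} of samples with |logSNR| < 1")
print(f"laplace: {frac_lap:.3f} of samples with |logSNR| < 1")

# Noising coefficients for a training batch: x_t = alpha * x_0 + sigma * eps.
alpha, sigma = logsnr_to_alpha_sigma(lam_lap)
```

Sampling logSNR from a chosen density and mapping it back to noising coefficients is what makes the importance-sampling view interchangeable with a modified noise schedule: the density over logSNR determines how often each noise level is visited during training.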