Improving the Training of Rectified Flows

Abstract

Diffusion models have shown great promise for image and video generation, but sampling from state-of-the-art models requires expensive numerical integration of a generative ODE. One approach for tackling this problem is rectified flows, which iteratively learn smooth ODE paths that are less susceptible to truncation error. However, rectified flows still require a relatively large number of function evaluations (NFEs). In this work, we propose improved techniques for training rectified flows, allowing them to compete with knowledge distillation methods even in the low NFE setting. Our main insight is that under realistic settings, a single iteration of the Reflow algorithm for training rectified flows is sufficient to learn nearly straight trajectories; hence, the current practice of using multiple Reflow iterations is unnecessary. We thus propose techniques to improve one-round training of rectified flows, including a U-shaped timestep distribution and LPIPS-Huber premetric. With these techniques, we improve the FID of the previous 2-rectified flow by up to 72% in the 1 NFE setting on CIFAR-10. On ImageNet 64$\times$64, our improved rectified flow outperforms the state-of-the-art distillation methods such as consistency distillation and progressive distillation in both one-step and two-step settings and rivals the performance of improved consistency training (iCT) in FID. Code is available at https://github.com/sangyun884/rfpp.
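To make the link between straight trajectories and low NFE concrete, here is a minimal one-dimensional sketch. It is a toy illustration and not the paper's implementation: `euler_sample`, `straight_velocity`, `curved_velocity`, `START`, and `TARGET` are hypothetical names, and the two velocity fields are hand-written stand-ins for a learned network. Both fields carry `START` to `TARGET` under exact integration, but only the constant (straight-path) field is solved exactly by a single Euler step.

```python
from typing import Callable

# Hypothetical endpoints of a single 1-D trajectory (illustration only).
START, TARGET = -1.0, 3.0


def euler_sample(velocity: Callable[[float, float], float], x0: float, nfe: int) -> float:
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 using `nfe` Euler steps.

    Each step costs one call to `velocity`, so `nfe` is the number of
    function evaluations.
    """
    x = x0
    dt = 1.0 / nfe
    for k in range(nfe):
        t = k * dt
        x = x + dt * velocity(x, t)
    return x


def straight_velocity(x: float, t: float) -> float:
    # Constant velocity: the path from START to TARGET is a straight line in time.
    return TARGET - START


def curved_velocity(x: float, t: float) -> float:
    # Time-varying velocity with the same exact endpoint: x(t) = START + t^2 * (TARGET - START).
    return 2.0 * t * (TARGET - START)


if __name__ == "__main__":
    for nfe in (1, 4, 16):
        straight = euler_sample(straight_velocity, START, nfe)
        curved = euler_sample(curved_velocity, START, nfe)
        print(f"NFE={nfe:2d}  straight={straight:.4f}  curved={curved:.4f}")
```

With the straight field, one step already lands on `TARGET`; with the curved field, the Euler truncation error shrinks only as the number of steps grows, which is the cost that straighter trajectories are meant to avoid.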
