Abstract
Diffusion-based generative models have demonstrated exceptional performance, yet their iterative sampling procedures remain computationally expensive. A prominent strategy to mitigate this cost is distillation, with offline distillation offering particular advantages in terms of efficiency, modularity, and flexibility. In this work, we identify two key observations that motivate a principled distillation framework: (1) while diffusion models have been viewed through the lens of dynamical systems theory, powerful and underexplored tools can be further leveraged; and (2) diffusion models inherently impose structured, semantically coherent trajectories in latent space. Building on these observations, we introduce the Koopman Distillation Model (KDM), a novel offline distillation approach grounded in Koopman theory - a classical framework for representing nonlinear dynamics linearly in a transformed space. KDM encodes noisy inputs into an embedded space where a learned linear operator propagates them forward, followed by a decoder that reconstructs clean samples. This enables single-step generation while preserving semantic fidelity. We provide theoretical justification for our approach: (1) under mild assumptions, the learned diffusion dynamics admit a finite-dimensional Koopman representation; and (2) proximity in the Koopman latent space correlates with semantic similarity in the generated outputs, allowing for effective trajectory alignment. KDM achieves highly competitive performance across standard offline distillation benchmarks.
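
To make the encode, propagate, decode pipeline concrete, the following is a minimal illustrative sketch, not the KDM implementation. It assumes a single-layer tanh encoder, a dense matrix `K` standing in for the learned linear (Koopman) operator, a linear decoder, and a mean-squared-error objective on precomputed (noise, sample) pairs from a teacher. All names, shapes, and the loss are hypothetical choices made for illustration; the abstract does not specify them.

```python
import numpy as np


def init_params(data_dim, latent_dim, seed=0):
    """Randomly initialise the three (hypothetical) components."""
    rng = np.random.default_rng(seed)
    return {
        # Encoder weights: data space -> embedded (latent) space.
        "W_enc": rng.normal(0.0, 1.0 / np.sqrt(data_dim), (data_dim, latent_dim)),
        # Linear operator acting in latent space, initialised near the identity.
        "K": np.eye(latent_dim) + 0.01 * rng.normal(size=(latent_dim, latent_dim)),
        # Decoder weights: latent space -> data space.
        "W_dec": rng.normal(0.0, 1.0 / np.sqrt(latent_dim), (latent_dim, data_dim)),
    }


def encode(x, params):
    """Nonlinear lift of noisy inputs into the embedded space."""
    return np.tanh(x @ params["W_enc"])


def propagate(z, params):
    """One linear step in latent space; this is the only 'dynamics' applied."""
    return z @ params["K"]


def decode(z, params):
    """Map propagated latents back to data space."""
    return z @ params["W_dec"]


def generate(noise, params):
    """Single-step generation: encode, apply K once, decode."""
    return decode(propagate(encode(noise, params), params), params)


def distillation_loss(noise, teacher_samples, params):
    """Assumed offline objective: MSE against precomputed teacher outputs."""
    return float(np.mean((generate(noise, params) - teacher_samples) ** 2))


if __name__ == "__main__":
    params = init_params(data_dim=8, latent_dim=16)
    noise = np.random.default_rng(1).normal(size=(4, 8))
    print(generate(noise, params).shape)
```

The point of the sketch is structural: all nonlinearity lives in the encoder and decoder, while the step that replaces the iterative sampler is a single matrix multiplication in latent space.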