Abstract
Diffusion-based image generation models excel at producing high-quality synthetic content, but suffer from slow and computationally expensive inference. Prior work has attempted to mitigate this by caching and reusing features within diffusion transformers across inference steps. These methods, however, often rely on rigid heuristics that result in limited acceleration or poor generalization across architectures. We propose Evolutionary Caching to Accelerate Diffusion models (ECAD), a genetic algorithm that learns efficient, per-model, caching schedules forming a Pareto frontier, using only a small set of calibration prompts. ECAD requires no modifications to network parameters or reference images. It offers significant inference speedups, enables fine-grained control over the quality-latency trade-off, and adapts seamlessly to different diffusion models. Notably, ECAD's learned schedules can generalize effectively to resolutions and model variants not seen during calibration. We evaluate ECAD on PixArt-alpha, PixArt-Sigma, and FLUX-1.dev using multiple metrics (FID, CLIP, Image Reward) across diverse benchmarks (COCO, MJHQ-30k, PartiPrompts), demonstrating consistent improvements over previous approaches. On PixArt-alpha, ECAD identifies a schedule that outperforms the previous state-of-the-art method by 4.47 COCO FID while increasing inference speedup from 2.35x to 2.58x. Our results establish ECAD as a scalable and generalizable approach for accelerating diffusion inference. Our project website is available at https://aniaggarwal.github.io/ecad and our code is available at https://github.com/aniaggarwal/ecad.
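
To make the notion of a Pareto frontier over caching schedules concrete, the following is a minimal sketch, not the ECAD implementation. It assumes each candidate schedule has already been scored by two objectives where lower is better (latency and FID); the function name `pareto_front`, the schedule labels, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# (label, latency, fid): both objectives are minimized.
Candidate = Tuple[str, float, float]


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates.

    A candidate is dominated if another one is no worse on both
    objectives and strictly better on at least one.
    """
    front = []
    for name, lat, fid in cands:
        dominated = any(
            (l2 <= lat and f2 <= fid) and (l2 < lat or f2 < fid)
            for _, l2, f2 in cands
        )
        if not dominated:
            front.append((name, lat, fid))
    # Sort by latency so the quality-latency trade-off is easy to read.
    return sorted(front, key=lambda c: c[1])


# Toy values for illustration only; these are not measured results.
candidates = [
    ("A", 1.0, 30.0),
    ("B", 0.6, 32.0),
    ("C", 0.8, 35.0),  # slower and worse than B, so it is dominated
    ("D", 0.4, 40.0),
]

for name, lat, fid in pareto_front(candidates):
    print(name, lat, fid)
```

In this toy setting, schedule C drops out because B is both faster and higher quality, while A, B, and D each represent a distinct point on the quality-latency trade-off. A genetic algorithm in this style would repeatedly mutate and recombine the surviving schedules and re-apply such a filter.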