Neural Network Parameter Diffusion

Abstract

Diffusion models have achieved remarkable success in image and video generation. In this work, we demonstrate that diffusion models can also generate high-performing neural network parameters. Our approach is simple, utilizing an autoencoder and a standard latent diffusion model. The autoencoder extracts latent representations of a subset of the trained network parameters. A diffusion model is then trained to synthesize these latent parameter representations from random noise. Once trained, it generates new representations that are passed through the autoencoder's decoder, whose outputs are ready to use as new subsets of network parameters. Across various architectures and datasets, our diffusion process consistently generates models whose performance is comparable to or better than that of trained networks, at minimal additional cost. Notably, we find empirically that the generated models do not memorize the trained networks. Our results encourage further exploration of the versatile use of diffusion models.
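
The three stages described above (encode parameter subsets, train a diffusion model on the latents, decode generated latents into new parameters) can be sketched in a few lines. This is only an illustration of the data flow under loud assumptions, not the paper's implementation: the "trained parameters" are synthetic vectors, a linear PCA projection stands in for the learned autoencoder, and a per-timestep linear least-squares noise predictor stands in for the denoising network. All names (`params`, `encode`, `decode`, `fit_denoiser`, `sample`) and all hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# ASSUMPTION: synthetic stand-in for K trained checkpoints. Each row is one
# flattened parameter subset of dimension D lying near a LATENT-dim subspace.
K, D, LATENT = 200, 32, 4
basis = rng.normal(size=(LATENT, D))
params = rng.normal(size=(K, LATENT)) @ basis + 0.01 * rng.normal(size=(K, D))

# 1) Autoencoder stand-in: linear encoder/decoder from PCA (not a learned network).
mean = params.mean(axis=0)
_, _, vt = np.linalg.svd(params - mean, full_matrices=False)
components = vt[:LATENT]  # shape (LATENT, D)

def encode(p):
    return (p - mean) @ components.T

def decode(z):
    return z @ components + mean

latents = encode(params)
scale = latents.std(axis=0)  # normalise each latent dimension to unit variance
z0 = latents / scale

# 2) Diffusion on the latents: standard DDPM noise schedule.
T = 50
betas = np.linspace(1e-4, 0.2, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def fit_denoiser(z0, n_repeats=20):
    """Fit one linear noise predictor per timestep by least squares."""
    weights = []
    for t in range(T):
        x0 = np.repeat(z0, n_repeats, axis=0)
        eps = rng.normal(size=x0.shape)
        # Forward process: z_t = sqrt(abar_t) * z_0 + sqrt(1 - abar_t) * eps
        zt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
        design = np.hstack([zt, np.ones((len(zt), 1))])  # add a bias column
        w, *_ = np.linalg.lstsq(design, eps, rcond=None)
        weights.append(w)
    return weights

weights = fit_denoiser(z0)

def predict_noise(zt, t):
    design = np.hstack([zt, np.ones((len(zt), 1))])
    return design @ weights[t]

# 3) Generation: DDPM ancestral sampling from pure noise, then decode.
def sample(n):
    z = rng.normal(size=(n, LATENT))
    for t in reversed(range(T)):
        eps_hat = predict_noise(z, t)
        z = (z - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:
            z = z + np.sqrt(betas[t]) * rng.normal(size=z.shape)
    return z

new_latents = sample(8)
new_params = decode(new_latents * scale)  # undo normalisation, then decode
print(new_params.shape)  # (8, 32)
```

In a real setting, each row of `new_params` would be reshaped and loaded back into the corresponding layers of a network, which would then be evaluated like any trained model.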
