Abstract
Generative models, with their success in image and video generation, have recently been explored for synthesizing effective neural network weights. These approaches take trained neural network checkpoints as training data, and aim to generate high-performing neural network weights during inference. In this work, we examine four representative methods on their ability to generate novel model weights, i.e., weights that are different from the checkpoints seen during training. Surprisingly, we find that these methods synthesize weights largely by memorization: they produce either replicas, or at best simple interpolations, of the training checkpoints. Current methods fail to outperform simple baselines, such as adding noise to the weights or taking a simple weight ensemble, in obtaining different and simultaneously high-performing models. We further show that this memorization cannot be effectively mitigated by modifying modeling factors commonly associated with memorization in image diffusion models, or applying data augmentations. Our findings provide a realistic assessment of what types of data current generative models can model, and highlight the need for more careful evaluation of generative models in new domains. Our code is available at https://github.com/boyazeng/weight_memorization.