Learning to Learn with Generative Models of Neural Network Checkpoints

  • 2022-09-26 18:59:58
  • William Peebles, Ilija Radosavovic, Tim Brooks, Alexei A. Efros, Jitendra Malik
  • 74

Abstract

We explore a data-driven approach for learning to optimize neural networks. We construct a dataset of neural network checkpoints and train a generative model on the parameters. In particular, our model is a conditional diffusion transformer that, given an initial input parameter vector and a prompted loss, error, or return, predicts the distribution over parameter updates that achieve the desired metric. At test time, it can optimize neural networks with unseen parameters for downstream tasks in just one update. We find that our approach successfully generates parameters for a wide range of loss prompts. Moreover, it can sample multimodal parameter solutions and has favorable scaling properties. We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
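As a rough illustration of the kind of data such a model could be trained on, the sketch below turns one training run's saved checkpoints into (starting parameters, starting loss, prompted loss, target parameters) tuples. The abstract does not specify this pairing scheme; it is an assumption, as are all names here (`Checkpoint`, `Example`, `toy_run`, `make_examples`) and the toy one-parameter quadratic task. The generative model itself is not shown.

```python
from itertools import combinations
from typing import List, NamedTuple, Tuple


class Checkpoint(NamedTuple):
    params: Tuple[float, ...]  # flattened parameter vector
    loss: float                # metric recorded when the checkpoint was saved


class Example(NamedTuple):
    start_params: Tuple[float, ...]   # conditioning input: initial parameters
    start_loss: float                 # metric of the initial parameters
    prompted_loss: float              # conditioning input: the desired metric
    target_params: Tuple[float, ...]  # parameters that achieved that metric


def toy_run(w0: float, steps: int, lr: float = 0.1) -> List[Checkpoint]:
    """Gradient descent on loss(w) = w**2, saving a checkpoint before each step."""
    w = w0
    run = []
    for _ in range(steps):
        run.append(Checkpoint((w,), w * w))
        w -= lr * 2 * w  # gradient of w**2 is 2w
    return run


def make_examples(run: List[Checkpoint]) -> List[Example]:
    """Pair every earlier checkpoint with every later one from the same run.

    Assumption for illustration: the later checkpoint's loss plays the role
    of the prompt, and its parameters are the generation target.
    """
    return [
        Example(a.params, a.loss, b.loss, b.params)
        for a, b in combinations(run, 2)  # combinations preserves run order
    ]


run = toy_run(w0=1.0, steps=5)
examples = make_examples(run)
```

A conditional generative model would then be fit to produce `target_params` (or the update from `start_params`) given `start_params`, `start_loss`, and `prompted_loss`; at test time, prompting with a low loss asks for parameters that reach it in a single update.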

 
