Deep Linear Probe Generators for Weight Space Learning

Abstract

Weight space learning aims to extract information about a neural network, such as its training dataset or generalization error. Recent approaches learn directly from model weights, but this presents many challenges, as weights are high-dimensional and include permutation symmetries between neurons. An alternative approach, probing, represents a model by passing a set of learned inputs (probes) through the model and training a predictor on top of the corresponding outputs. Although probing is typically not used as a standalone approach, our preliminary experiment found that a vanilla probing baseline worked surprisingly well. However, we discover that current probe learning strategies are ineffective. We therefore propose Deep Linear Probe Generators (ProbeGen), a simple and effective modification to probing approaches. ProbeGen adds a shared generator module with a deep linear architecture, providing an inductive bias towards structured probes and thus reducing overfitting. While simple, ProbeGen performs significantly better than the state-of-the-art and is very efficient, requiring between 30 and 1000 times fewer FLOPs than other top approaches.
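
The probing pipeline described above can be illustrated with a minimal forward-pass sketch in NumPy. This is not the authors' implementation: the sizes, the names (generate_probes, make_toy_model, probe_representation), the toy MLP standing in for a model under analysis, and the linear predictor head are all assumptions made for illustration, and no training loop is shown. It only shows how per-probe latent codes could be mapped to probes by a shared stack of linear layers with no activations, passed through a model, and turned into a feature vector for a predictor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
NUM_PROBES, LATENT_DIM, HIDDEN_DIM, INPUT_DIM, NUM_CLASSES = 8, 4, 16, 10, 3

# Per-probe latent codes. In a real setup these would be learned;
# here they are randomly initialized and never trained.
latents = rng.normal(size=(NUM_PROBES, LATENT_DIM))

# Shared deep linear generator: stacked linear layers with
# no activation functions between them (assumed depth of 3).
generator_layers = [
    rng.normal(size=(LATENT_DIM, HIDDEN_DIM)) / np.sqrt(LATENT_DIM),
    rng.normal(size=(HIDDEN_DIM, HIDDEN_DIM)) / np.sqrt(HIDDEN_DIM),
    rng.normal(size=(HIDDEN_DIM, INPUT_DIM)) / np.sqrt(HIDDEN_DIM),
]


def generate_probes(latents, layers):
    """Map latent codes to probes through the linear stack."""
    x = latents
    for w in layers:
        x = x @ w
    return x


def make_toy_model(rng):
    """Hypothetical stand-in for a network under analysis: a small random MLP."""
    w1 = rng.normal(size=(INPUT_DIM, 32))
    w2 = rng.normal(size=(32, NUM_CLASSES))
    return lambda x: np.tanh(x @ w1) @ w2


def probe_representation(model, probes):
    """Represent a model by its flattened outputs on all probes."""
    return model(probes).reshape(-1)


probes = generate_probes(latents, generator_layers)
model = make_toy_model(rng)
features = probe_representation(model, probes)

# Assumed predictor: a single linear head on top of the probe outputs.
head = rng.normal(size=(features.shape[0], 1))
prediction = features @ head

# Without activations, the layer stack composes into one matrix.
collapsed = generator_layers[0] @ generator_layers[1] @ generator_layers[2]
```

Because the generator has no nonlinearities, the stacked layers can be multiplied into a single matrix (collapsed above) that produces the same probes, so under this sketch the depth would add no cost once the generator is fixed. Any regularizing effect of the deep factorization would then have to arise during training, which is not modeled here.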
