Classifying the classifier: dissecting the weight space of neural networks

Abstract

This paper presents an empirical study on the weights of neural networks, where we interpret each model as a point in a high-dimensional space -- the neural weight space. To explore the complex structure of this space, we sample from a diverse selection of training variations (dataset, optimization procedure, architecture, etc.) of neural network classifiers, and train a large number of models to represent the weight space. Then, we use a machine learning approach for analyzing and extracting information from this space. Most centrally, we train a number of novel deep meta-classifiers with the objective of classifying different properties of the training setup by identifying their footprints in the weight space. Thus, the meta-classifiers probe for patterns induced by hyper-parameters, so that we can quantify how much, where, and when these are encoded through the optimization process. This provides a novel and complementary view for explainable AI, and we show how meta-classifiers can reveal a great deal of information about the training setup and optimization, by only considering a small subset of randomly selected consecutive weights. To promote further research on the weight space, we release the neural weight space (NWS) dataset -- a collection of 320K weight snapshots from 16K individually trained deep neural networks.
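
To make the phrase "a small subset of randomly selected consecutive weights" concrete, the following is a minimal sketch of how such an input could be drawn from one model's flattened weight vector. It is not the authors' implementation: the function name `sample_consecutive_weights`, the segment length of 500, and the synthetic weight vector are all assumptions chosen for illustration.

```python
import random


def sample_consecutive_weights(flat_weights, length, rng):
    """Return (start, segment): a random contiguous run of `length` weights.

    `flat_weights` is assumed to be one model's weights flattened into a
    single list; the start position is drawn uniformly over all valid offsets.
    """
    if length <= 0 or length > len(flat_weights):
        raise ValueError("segment length must be in 1..len(flat_weights)")
    start = rng.randrange(len(flat_weights) - length + 1)
    return start, flat_weights[start:start + length]


rng = random.Random(0)
# Hypothetical stand-in for a trained network's flattened weights.
weights = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
start, segment = sample_consecutive_weights(weights, 500, rng)
```

In a setup like the one the abstract describes, a segment of this kind (possibly together with its position) would be the input to a meta-classifier, and the label would be a property of the training setup, such as the dataset or optimizer.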
