Abstract
Diagnosing deep neural networks (DNNs) through the eigenspectrum of weight matrices has been an active area of research in recent years. At a high level, eigenspectrum analysis of DNNs involves measuring the heavytailness of the empirical spectral densities (ESD) of weight matrices. It provides insight into how well a model is trained and can guide decisions on assigning better layer-wise training hyperparameters. In this paper, we address a challenge associated with such eigenspectrum methods: the impact of the aspect ratio of weight matrices on estimated heavytailness metrics. We demonstrate that matrices of varying sizes (and aspect ratios) introduce a non-negligible bias in estimating heavytailness metrics, leading to inaccurate model diagnosis and layer-wise hyperparameter assignment. To overcome this challenge, we propose FARMS (Fixed-Aspect-Ratio Matrix Subsampling), a method that normalizes the weight matrices by subsampling submatrices with a fixed aspect ratio. Instead of measuring the heavytailness of the original ESD, we measure the average ESD of these subsampled submatrices. We show that measuring the heavytailness of these submatrices with the fixed aspect ratio can effectively mitigate the aspect ratio bias. We validate our approach across various optimization techniques and application domains that involve eigenspectrum analysis of weights, including image classification in computer vision (CV) models, scientific machine learning (SciML) model training, and large language model (LLM) pruning. Our results show that despite its simplicity, FARMS uniformly improves the accuracy of eigenspectrum analysis while enabling more effective layer-wise hyperparameter assignment in these application domains. In one of the LLM pruning experiments, FARMS reduces the perplexity of the LLaMA-7B model by 17.3% when compared with the state-of-the-art method.
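The subsampling idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the function names `farms_eigenvalues` and `hill_alpha` are hypothetical, submatrices are drawn by random row and column selection, the ESD is taken to be the eigenvalues of the Gram matrix (squared singular values), and heavytailness is measured with a Hill-type estimator of the power-law exponent. Because every submatrix has the same shape, pooling their eigenvalues yields the same empirical density as averaging the per-submatrix ESDs.

```python
import numpy as np

def farms_eigenvalues(W, aspect_ratio=1.0, num_samples=8, seed=None):
    """Pool eigenvalues from random submatrices of W that share one aspect ratio.

    Assumption: aspect_ratio = rows / cols >= 1 for each submatrix.
    """
    rng = np.random.default_rng(seed)
    m, n = W.shape
    if m < n:                      # orient W so that rows >= cols
        W, m, n = W.T, n, m
    cols = n
    rows = int(round(aspect_ratio * cols))
    if rows > m:                   # shrink the window if it does not fit
        rows = m
        cols = max(1, int(rows / aspect_ratio))
    pooled = []
    for _ in range(num_samples):
        r = rng.choice(m, size=rows, replace=False)
        c = rng.choice(n, size=cols, replace=False)
        sub = W[np.ix_(r, c)]
        # Squared singular values = eigenvalues of sub.T @ sub
        s = np.linalg.svd(sub, compute_uv=False)
        pooled.append(s ** 2)
    return np.concatenate(pooled)

def hill_alpha(eigs, k=None):
    """Hill-type estimate of the power-law exponent from the k largest eigenvalues."""
    eigs = np.sort(np.asarray(eigs, dtype=float))[::-1]
    eigs = eigs[eigs > 0]
    if k is None:
        k = len(eigs) // 2         # assumed default: use the upper half of the spectrum
    k = max(1, min(k, len(eigs) - 1))
    logs = np.log(eigs[:k] / eigs[k])
    return 1.0 + k / logs.sum()

# Usage: two layers with different shapes are compared at the same aspect ratio.
rng = np.random.default_rng(0)
layer_a = rng.standard_normal((256, 64))
layer_b = rng.standard_normal((128, 128))
for W in (layer_a, layer_b):
    eigs = farms_eigenvalues(W, aspect_ratio=1.0, num_samples=8, seed=0)
    print(W.shape, hill_alpha(eigs))
```

In this sketch both layers are reduced to square submatrices before the tail exponent is estimated, so the two estimates are computed at a common aspect ratio rather than at the layers' native ones.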