Abstracting Sparse DNN Acceleration via Structured Sparse Tensor Decomposition

Abstract

Exploiting sparsity in deep neural networks (DNNs) has been a promising approach to meeting the growing computation needs of modern DNNs. However, in practice, sparse DNN acceleration still faces a key challenge. To minimize the overhead of sparse acceleration, hardware designers have recently proposed structured sparse hardware support, which provides limited flexibility and requires extra model fine-tuning. Moreover, a sparse model fine-tuned for one kind of structured sparse hardware cannot be accelerated by other structured sparse hardware. To bridge the gap between sparse DNN models and hardware, this paper proposes tensor approximation via structured decomposition (TASD), which leverages the distributive property in linear algebra to turn any sparse tensor into a series of structured sparse tensors. Next, we develop a software framework, TASDER, to accelerate DNNs by searching for layer-wise, high-quality structured decompositions of both weight and activation tensors so that they can be accelerated by any system with structured sparse hardware support. Evaluation results show that, by exploiting prior structured sparse hardware baselines, our method can accelerate off-the-shelf dense and sparse DNNs without fine-tuning and improve energy-delay product by up to 83% and by 74% on average.
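The decomposition idea can be illustrated with a minimal NumPy sketch. It is not the TASDER implementation: the abstract does not name a sparsity pattern, so N:M sparsity (at most N nonzeros in every group of M consecutive entries) is assumed here, and the helpers `nm_prune` and `structured_decompose` as well as the greedy magnitude-based selection are hypothetical choices made only for illustration. The sketch peels structured sparse terms off a tensor A one at a time and then uses the distributive property, (A1 + A2)B = A1 B + A2 B, to approximate A B with one structured sparse matrix product per term.

```python
import numpy as np

def nm_prune(x, n, m):
    """Keep the n largest-magnitude entries in each group of m consecutive
    entries along the last axis; zero out the rest (assumed N:M pattern)."""
    rows, cols = x.shape
    assert cols % m == 0, "column count must be a multiple of m"
    groups = x.reshape(rows, cols // m, m)
    # Indices of the n largest magnitudes inside every group.
    keep = np.argsort(-np.abs(groups), axis=-1)[..., :n]
    mask = np.zeros_like(groups, dtype=bool)
    np.put_along_axis(mask, keep, True, axis=-1)
    return np.where(mask, groups, 0.0).reshape(rows, cols)

def structured_decompose(x, patterns):
    """Greedily split x into one structured sparse term per (n, m) pattern.
    Returns the terms and whatever residual the terms do not cover."""
    terms = []
    residual = x.copy()
    for n, m in patterns:
        term = nm_prune(residual, n, m)
        terms.append(term)
        residual = residual - term
    return terms, residual

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 8))
A[rng.random(A.shape) < 0.5] = 0.0   # unstructured sparse tensor
B = rng.standard_normal((8, 3))

# Approximate A by a 2:4 term plus a 1:4 term (illustrative patterns).
patterns = [(2, 4), (1, 4)]
terms, residual = structured_decompose(A, patterns)

# Distributive property: (A1 + A2) @ B == A1 @ B + A2 @ B, so each term
# could be dispatched to structured sparse hardware separately.
approx = sum(t @ B for t in terms)
error = np.linalg.norm(A @ B - approx)
print("approximation error:", error)
```

The approximation error comes entirely from the residual that the chosen patterns leave uncovered. A longer or denser series of patterns shrinks it, at the cost of more structured sparse products per layer.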
