U-Net Fixed-Point Quantization for Medical Image Segmentation

Abstract

Model quantization is leveraged to reduce the memory consumption and the computation time of deep neural networks. This is achieved by representing weights and activations with a lower bit resolution than their high-precision floating-point counterparts. The suitable level of quantization is directly related to the model performance. Lowering the quantization precision (e.g. to 2 bits) reduces the amount of memory required to store model parameters and the amount of logic required to implement computational blocks, which contributes to reducing the power consumption of the entire system. These benefits typically come at the cost of reduced accuracy. The main challenge is to quantize a network as much as possible while maintaining its accuracy. In this work, we present a quantization method for the U-Net architecture, a popular model in medical image segmentation. We then apply our quantization algorithm to three datasets: (1) the Spinal Cord Gray Matter Segmentation (GM) dataset, (2) the ISBI challenge for segmentation of neuronal structures in Electron Microscopic (EM) images, and (3) the public National Institutes of Health (NIH) dataset for pancreas segmentation in abdominal CT scans. The reported results demonstrate that with only 4 bits for weights and 6 bits for activations, we obtain an 8-fold reduction in memory requirements while losing only 2.21%, 0.57% and 2.09% Dice overlap score for the EM, GM and NIH datasets, respectively. Our fixed-point quantization provides a flexible trade-off between accuracy and memory requirements, which is not provided by previous quantization methods for U-Net such as TernaryNet.
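To make the idea of fixed-point quantization concrete, the following minimal Python sketch rounds values onto a signed fixed-point grid and computes the storage ratio against a floating-point baseline. It is an illustration only, not the method of this paper: the function names, the split between integer and fractional bits, and the 32-bit baseline are assumptions introduced here.

```python
def quantize_fixed_point(values, total_bits, frac_bits):
    """Round values to a signed fixed-point grid and clip to its range.

    total_bits and frac_bits are illustrative parameters (assumptions),
    not settings taken from the paper.
    """
    scale = 2 ** frac_bits                  # grid step is 1 / scale
    q_min = -(2 ** (total_bits - 1))        # smallest signed integer code
    q_max = 2 ** (total_bits - 1) - 1       # largest signed integer code
    quantized = []
    for v in values:
        code = round(v * scale)             # nearest integer code
        code = max(q_min, min(q_max, code)) # saturate out-of-range values
        quantized.append(code / scale)      # map back to a real value
    return quantized


def memory_reduction(full_bits, quant_bits):
    """Storage ratio between a full-precision and a quantized value."""
    return full_bits / quant_bits


# Hypothetical weights, quantized to 4 bits with 3 fractional bits.
weights = [0.8, -0.33, 0.05, -1.2]
q_weights = quantize_fixed_point(weights, total_bits=4, frac_bits=3)
print(q_weights)

# Assuming a 32-bit floating-point baseline, 4-bit weights need 1/8 the storage.
print(memory_reduction(32, 4))
```

With 4 bits and 3 fractional bits, only 16 distinct values between -1.0 and 0.875 can be represented, so small weights collapse to zero and large ones saturate. This is the accuracy cost that has to be traded against the memory saving.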
