FPGA Implementation of Convolutional Neural Networks with Fixed-Point Calculations

Abstract

Neural network-based methods for image processing are becoming widely used in practical applications. Modern neural networks are computationally expensive and require specialized hardware, such as graphics processing units. Since such hardware is not always available in real-life applications, there is a compelling need for the design of neural networks for mobile devices. Mobile neural networks typically have a reduced number of parameters and require a relatively small number of arithmetic operations. However, they are usually still executed at the software level and use floating-point calculations. The use of mobile networks without further optimization may not provide sufficient performance when high processing speed is required, for example, in real-time video processing (30 frames per second). In this study, we suggest optimizations to speed up computations in order to efficiently use already trained neural networks on a mobile device. Specifically, we propose an approach for speeding up neural networks by moving computation from software to hardware and by using fixed-point calculations instead of floating-point. We propose a number of methods for neural network architecture design to improve the performance with fixed-point calculations. We also show an example of how existing datasets can be modified and adapted for the recognition task at hand. Finally, we present the design and the implementation of a field-programmable gate array (FPGA)-based device to solve the practical problem of real-time handwritten digit classification from a mobile camera video feed.
