Hardware Trojan Attacks on Neural Networks

Abstract

With the rising popularity of machine learning and the ever-increasing demand for computational power, there is a growing need for hardware-optimized implementations of neural networks and other machine learning models. As the technology evolves, it is also plausible that machine learning or artificial intelligence will soon be deployed in consumer electronic products and military equipment in the form of well-trained models. Unfortunately, the modern fabless business model of manufacturing hardware, while economical, leads to security deficiencies throughout the supply chain. In this paper, we illuminate these security issues by introducing hardware Trojan attacks on neural networks, expanding the current taxonomy of neural network security to incorporate attacks of this nature. To aid in this, we develop a novel framework for inserting malicious hardware Trojans into the implementation of a neural network classifier. We evaluate the capabilities of the adversary in this setting by implementing the attack algorithm on convolutional neural networks while controlling a variety of parameters available to the adversary. Our experimental results show that the proposed algorithm can effectively cause a selected input trigger to be classified as a specified class on the MNIST dataset by injecting hardware Trojans into, on average, $0.03\%$ of the neurons in the 5th hidden layer of arbitrary 7-layer convolutional neural networks, while remaining undetectable under the test data. Finally, we discuss potential defenses to protect neural networks against hardware Trojan attacks.
