Understanding the Role of Individual Units in a Deep Neural Network

Abstract

Deep neural networks excel at finding hierarchical representations that solvecomplex tasks over large data sets. How can we humans understand these learnedrepresentations? In this work, we present network dissection, an analyticframework to systematically identify the semantics of individual hidden unitswithin image classification and image generation networks. First, we analyze aconvolutional neural network (CNN) trained on scene classification and discoverunits that match a diverse set of object concepts. We find evidence that thenetwork has learned many object classes that play crucial roles in classifyingscene classes. Second, we use a similar analytic method to analyze a generativeadversarial network (GAN) model trained to generate scenes. By analyzingchanges made when small sets of units are activated or deactivated, we findthat objects can be added and removed from the output scenes while adapting tothe context. Finally, we apply our analytic framework to understandingadversarial attacks and to semantic image editing.

Quick Read (beta)

loading the full paper ...