Net2Vec: Quantifying and Explaining how Concepts are Encoded by Filters in Deep Neural Networks

Abstract

In an effort to understand the meaning of the intermediate representations captured by deep networks, recent papers have tried to associate specific semantic concepts to individual neural network filter responses, where interesting correlations are often found, largely by focusing on extremal filter responses. In this paper, we show that this approach can favor easy-to-interpret cases that are not necessarily representative of the average behavior of a representation. A more realistic but harder-to-study hypothesis is that semantic representations are distributed, and thus filters must be studied in conjunction. In order to investigate this idea while enabling systematic visualization and quantification of multiple filter responses, we introduce the Net2Vec framework, in which semantic concepts are mapped to vectorial embeddings based on corresponding filter responses. By studying such embeddings, we are able to show that (1) in most cases, multiple filters are required to code for a concept, (2) filters are often not concept-specific and help encode multiple concepts, and (3) compared to single filter activations, filter embeddings are able to better characterize the meaning of a representation and its relationship to other concepts.
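The idea of mapping a concept to a vector over filters can be illustrated with a small sketch. The abstract does not say how the embeddings are learned, so everything below is an assumption made for illustration: the embedding is taken to be the weight vector of a logistic regression over per-location filter activations, the data are synthetic, and the names `learn_concept_embedding`, `iou`, `cosine_similarity`, and `demo` are hypothetical. It is not the paper's implementation.

```python
import numpy as np


def learn_concept_embedding(acts, mask, steps=500, lr=0.5):
    """Fit one weight per filter so that the weighted sum predicts the concept.

    acts: (N, K) array, K filter activations at each of N locations.
    mask: (N,) binary array, 1 where the concept is present.
    Assumption: plain logistic regression trained by gradient descent.
    """
    n, k = acts.shape
    w = np.zeros(k)
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(acts @ w + b)))  # predicted probability
        grad = p - mask                             # gradient of the log loss
        w -= lr * (acts.T @ grad) / n
        b -= lr * grad.mean()
    return w, b


def iou(pred, mask):
    """Intersection over union of two binary arrays."""
    pred = pred.astype(bool)
    mask = mask.astype(bool)
    union = np.logical_or(pred, mask).sum()
    return float(np.logical_and(pred, mask).sum() / union) if union else 0.0


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def demo(seed=0):
    """Synthetic case where a concept is spread over two of eight filters."""
    rng = np.random.default_rng(seed)
    acts = rng.normal(size=(2000, 8))
    mask = (acts[:, 0] + acts[:, 1] > 0).astype(float)  # distributed concept

    w, b = learn_concept_embedding(acts, mask)
    multi = iou(acts @ w + b > 0, mask)   # all filters, learned weights
    single = iou(acts[:, 0] > 0, mask)    # one filter, thresholded at zero
    return w, multi, single


if __name__ == "__main__":
    w, multi, single = demo()
    print("embedding:", np.round(w, 2))
    print("IoU, combined filters:", round(multi, 3))
    print("IoU, single filter:", round(single, 3))
```

In this toy setting the concept depends on two filters by construction, so the combined prediction overlaps the concept mask more closely than any single thresholded filter can. Embeddings learned this way for different concepts could then be compared with `cosine_similarity`.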
