Abstract
Convolutional Neural Networks (CNNs) have been pivotal to state-of-the-art results on many classification problems in a wide variety of domains (e.g., vision, speech, graphs, and medical imaging). A commonality within those domains is the presence of hierarchical, spatially agglomerative local-to-global interactions within the data. For two-dimensional images, such interactions may induce an a priori relationship between the pixel data and the underlying spatial ordering of the pixels. For instance, in natural images, neighboring pixels are more likely to contain similar values than non-neighboring pixels that are further apart. To that end, we propose a statistical metric called spatial orderness, which quantifies the extent to which the (2D) input data obeys the underlying spatial ordering at various scales. In our experiments, we mainly find that adding convolutional layers to a CNN could be counterproductive for data bereft of spatial order at higher scales. We also observe, quite counter-intuitively, that the spatial orderness of CNN feature maps shows a synchronized increase during the initial stages of training, and that validation performance only improves after the spatial orderness of the feature maps starts decreasing. Lastly, we present a theoretical analysis (and empirical validation) of the spatial orderness of network weights, where we find that using smaller kernel sizes leads to kernels of greater spatial orderness and vice versa.