Non-deep Networks - Paper Detail

Abstract

Depth is the hallmark of deep neural networks. But more depth means more sequential computation and higher latency. This begs the question -- is it possible to build high-performing "non-deep" neural networks? We show that it is. To do so, we use parallel subnetworks instead of stacking one layer after another. This helps effectively reduce depth while maintaining high performance. By utilizing parallel substructures, we show, for the first time, that a network with a depth of just 12 can achieve top-1 accuracy over 80% on ImageNet, 96% on CIFAR10, and 81% on CIFAR100. We also show that a network with a low-depth (12) backbone can achieve an AP of 48% on MS-COCO. We analyze the scaling rules for our design and show how to increase performance without changing the network's depth. Finally, we provide a proof of concept for how non-deep networks could be used to build low-latency recognition systems. Code is available at https://github.com/imankgoyal/NonDeepNetworks.
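
As a rough illustration of the idea in the abstract, the sketch below treats depth as the longest chain of layers that must run one after another, and compares a plain stack with a set of parallel branches. It is a toy calculation, not the authors' architecture: the function names, the branch lengths, and the single fusion layer are all assumptions chosen so that the parallel layout lands at a depth of 12.

```python
# Toy model of "depth" as the longest sequential chain of layers.
# All names and numbers here are hypothetical, for illustration only.

def sequential_depth(num_layers):
    """A plain stack runs every layer one after another."""
    return num_layers

def parallel_depth(branch_lengths, fusion_layers=0):
    """Branches run side by side, so depth is the longest branch
    plus any layers applied after the branches are merged."""
    return max(branch_lengths) + fusion_layers

def total_layers(branch_lengths, fusion_layers=0):
    """Count every layer, regardless of which branch it sits in."""
    return sum(branch_lengths) + fusion_layers

branches = [11, 11, 11]   # assumed: three branches of 11 layers each
fusion = 1                # assumed: one layer after merging the branches

layers = total_layers(branches, fusion)
print("layers:", layers)
print("depth if stacked:", sequential_depth(layers))
print("depth if parallel:", parallel_depth(branches, fusion))
```

Under these assumptions the same number of layers gives a depth of 34 when stacked and 12 when arranged in parallel, which is the sense in which parallel substructures can add capacity without adding sequential steps.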
