YOLO Nano: a Highly Compact You Only Look Once Convolutional Neural Network for Object Detection

Abstract

Object detection remains an active area of research in the field of computer vision, and considerable advances and successes have been achieved in this area through the design of deep convolutional neural networks for tackling object detection. Despite these successes, one of the biggest challenges to widespread deployment of such object detection networks in edge and mobile scenarios is their high computational and memory requirements. As such, there has been growing research interest in the design of efficient deep neural network architectures for edge and mobile usage. In this study, we introduce YOLO Nano, a highly compact deep convolutional neural network for the task of object detection. A human-machine collaborative design strategy is leveraged to create YOLO Nano, where principled network design prototyping, based on design principles from the YOLO family of single-shot object detection network architectures, is coupled with machine-driven design exploration to create a compact network with highly customized module-level macroarchitecture and microarchitecture designs tailored for the task of embedded object detection. The proposed YOLO Nano possesses a model size of ~4.0MB (>15.1x and >8.3x smaller than Tiny YOLOv2 and Tiny YOLOv3, respectively) and requires 4.57B operations for inference (>34% and ~17% lower than Tiny YOLOv2 and Tiny YOLOv3, respectively) while still achieving an mAP of ~69.1% on the VOC 2007 dataset (~12% and ~10.7% higher than Tiny YOLOv2 and Tiny YOLOv3, respectively). Experiments on inference speed and power efficiency on a Jetson AGX Xavier embedded module at different power budgets further demonstrate the efficacy of YOLO Nano for embedded scenarios.
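
The relative figures in the abstract can be turned back into rough baseline values with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, the results are back-of-the-envelope bounds implied by the quoted ratios (not values reported by the authors), and it assumes that the mAP gains are absolute percentage points, which the abstract does not state explicitly.

```python
# Figures quoted in the abstract for YOLO Nano.
NANO_SIZE_MB = 4.0   # model size, approximate ("~4.0MB")
NANO_OPS_B = 4.57    # operations per inference, in billions
NANO_MAP = 69.1      # mAP on VOC 2007, in percent, approximate


def implied_baseline_size(ratio):
    """Size (MB) of a baseline that is `ratio` times larger than YOLO Nano."""
    return NANO_SIZE_MB * ratio


def implied_baseline_ops(reduction):
    """Ops (billions) of a baseline if YOLO Nano needs `reduction` (a fraction) fewer."""
    return NANO_OPS_B / (1.0 - reduction)


def implied_baseline_map(gain_points):
    """Baseline mAP (%), ASSUMING the quoted gain is in absolute percentage points."""
    return NANO_MAP - gain_points


# Ratios as quoted in the abstract; ">" and "~" make these bounds or estimates.
quoted = {
    "Tiny YOLOv2": {"size_ratio": 15.1, "ops_reduction": 0.34, "map_gain": 12.0},
    "Tiny YOLOv3": {"size_ratio": 8.3, "ops_reduction": 0.17, "map_gain": 10.7},
}

implied = {
    name: {
        "size_mb": implied_baseline_size(q["size_ratio"]),
        "ops_b": implied_baseline_ops(q["ops_reduction"]),
        "map": implied_baseline_map(q["map_gain"]),
    }
    for name, q in quoted.items()
}

for name, values in implied.items():
    print(f"{name}: >= {values['size_mb']:.1f} MB, "
          f"~{values['ops_b']:.2f}B ops, ~{values['map']:.1f}% mAP")
```

Because the abstract gives the size ratios and the first operation reduction as lower bounds (">"), the corresponding implied sizes and operation counts should be read as lower bounds on the baselines, not exact values.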
