Image colorization is inherently an ill-posed problem with multi-modal uncertainty. Previous methods leverage deep neural networks to map input grayscale images to plausible color outputs directly. Although these learning-based methods have shown impressive performance, they usually fail on input images that contain multiple objects. The leading cause is that existing models perform learning and colorization on the entire image. In the absence of a clear figure-ground separation, these models cannot effectively locate and learn meaningful object-level semantics. In this paper, we propose a method for achieving instance-aware colorization. Our network architecture leverages an off-the-shelf object detector to obtain cropped object images and uses an instance colorization network to extract object-level features. We use a similar network to extract the full-image features and apply a fusion module to fuse object-level and image-level features to predict the final colors. Both colorization networks and fusion modules are learned from a large-scale dataset. Experimental results show that our work outperforms existing methods on different quality metrics and achieves state-of-the-art performance on image colorization.
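
To make the fusion step concrete, the following is a minimal NumPy sketch of one plausible way to combine object-level and image-level features. It is an illustration under stated assumptions, not the network described above: the function name `fuse_features`, the per-pixel weight maps, and the softmax-weighted sum are all hypothetical choices, and the instance features are assumed to be already resized to their bounding boxes.

```python
import numpy as np

def fuse_features(full_feat, inst_feats, boxes, full_w, inst_ws):
    """Blend full-image and per-instance feature maps (hypothetical sketch).

    full_feat:  (C, H, W) full-image features
    inst_feats: list of (C, h, w) instance features, already sized to their box
    boxes:      list of (y0, x0, y1, x1) boxes in feature-map coordinates
    full_w:     (H, W) weight logits for the full-image features
    inst_ws:    list of (h, w) weight logits, one per instance
    """
    C, H, W = full_feat.shape
    feats, logits = [full_feat], [full_w]
    for feat, w, (y0, x0, y1, x1) in zip(inst_feats, inst_ws, boxes):
        # Paste each instance into a full-size canvas; outside the box the
        # logit is -inf, so the instance gets zero weight there.
        padded_feat = np.zeros((C, H, W))
        padded_logit = np.full((H, W), -np.inf)
        padded_feat[:, y0:y1, x0:x1] = feat
        padded_logit[y0:y1, x0:x1] = w
        feats.append(padded_feat)
        logits.append(padded_logit)
    # Per-pixel softmax across the full image and all instances.
    logits = np.stack(logits)                       # (N + 1, H, W)
    logits = logits - logits.max(axis=0, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=0, keepdims=True)
    # Weighted sum of feature maps; weights broadcast over channels.
    return (np.stack(feats) * weights[:, None]).sum(axis=0)
```

In this sketch, pixels outside every detected box keep the full-image features unchanged, while pixels inside a box receive a blend whose proportions depend on the weight logits. In a trained model those logits would presumably be predicted by the fusion module rather than supplied by hand.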