Abstract
Accurate depth estimation from images is a fundamental task in many applications, including scene understanding and reconstruction. Existing solutions for depth estimation often produce blurry, low-resolution approximations. This paper presents a convolutional neural network for computing a high-resolution depth map given a single RGB image with the help of transfer learning. Following a standard encoder-decoder architecture, we initialize our encoder with features extracted by high-performing pre-trained networks and combine this with augmentation and training strategies that lead to more accurate results. We show that, even with a very simple decoder, our method achieves detailed high-resolution depth maps. Our network, with fewer parameters and training iterations, outperforms the state of the art on two datasets and also produces qualitatively better results that capture object boundaries more faithfully. Code and corresponding pre-trained weights are made publicly available.