We investigate the problem of automatically determining what type of shoe left an impression found at a crime scene. This recognition problem is made difficult by the variability in types of crime scene evidence (ranging from traces of dust or oil on hard surfaces to impressions made in soil) and the lack of comprehensive databases of shoe outsole tread patterns. We find that mid-level features extracted by pre-trained convolutional neural nets are surprisingly effective descriptors for this specialized domain. However, the choice of similarity measure for matching exemplars to a query image is essential to good performance. For matching multi-channel deep features, we propose the use of multi-channel normalized cross-correlation and analyze its effectiveness. Our proposed metric significantly improves performance in matching crime scene shoeprints to laboratory test impressions. We also show its effectiveness in other cross-domain image retrieval problems: matching facade images to segmentation labels and aerial photos to map images. Finally, we introduce a discriminatively trained variant and fine-tune our system through our proposed metric, obtaining state-of-the-art performance.
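As a rough illustration of the similarity measure named above, the following is a minimal sketch of one plausible reading of multi-channel normalized cross-correlation. The abstract does not give the definition, so the details here are assumptions: each feature channel is standardized independently (zero mean, unit norm over spatial positions), the per-channel correlations are averaged, and the two feature maps are taken to be already spatially aligned. The function name `mcncc` and the `eps` parameter are hypothetical.

```python
import numpy as np


def mcncc(x, y, eps=1e-8):
    """Sketch of multi-channel normalized cross-correlation (assumed form).

    x, y: feature maps of shape (C, H, W), assumed spatially aligned.
    Returns the mean over channels of the per-channel normalized
    cross-correlation, a value in roughly [-1, 1].
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.ndim != 3 or x.shape != y.shape:
        raise ValueError("x and y must both have shape (C, H, W)")

    c = x.shape[0]
    xf = x.reshape(c, -1)  # flatten spatial dimensions per channel
    yf = y.reshape(c, -1)

    # Assumption: normalize each channel independently by removing its mean.
    xz = xf - xf.mean(axis=1, keepdims=True)
    yz = yf - yf.mean(axis=1, keepdims=True)

    # Per-channel correlation: inner product divided by the product of norms.
    num = (xz * yz).sum(axis=1)
    den = np.sqrt((xz ** 2).sum(axis=1) * (yz ** 2).sum(axis=1)) + eps

    # Assumption: channels are combined by a plain average.
    return float((num / den).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    query = rng.normal(size=(4, 8, 8))      # stand-in for deep features
    exemplar = 3.0 * query + 2.0            # same pattern, different gain/bias
    print(mcncc(query, exemplar))
```

Because each channel is standardized separately, a per-channel change of gain or offset between the two feature maps leaves the score unchanged under this reading, which is one reason such a measure might suit matching across domains with different appearance statistics.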