In this paper, we present DeepSIM, a generative model for conditional image manipulation based on a single image. We find that extensive augmentation is key to enabling single-image training, and we incorporate thin-plate-spline (TPS) warping as an effective augmentation. Our network learns to map from a primitive representation of the image to the image itself. The choice of primitive representation affects the ease and expressiveness of the manipulations and can be automatic (e.g., edges), manual (e.g., segmentation), or hybrid, such as edges on top of segmentations. At manipulation time, our generator allows complex image changes to be made by modifying the primitive input representation and mapping it through the network. Our method is shown to achieve remarkable performance on image manipulation tasks.
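To illustrate the TPS augmentation described above, the following is a minimal sketch of how a single (primitive, image) training pair could be warped with one shared random thin-plate-spline deformation. It is not the DeepSIM implementation: the function names (`fit_tps`, `apply_tps`, `random_tps_pair`), the 3x3 control grid, the shift magnitude `scale`, and the nearest-neighbor sampling are all assumptions chosen for brevity, and only NumPy is used.

```python
import numpy as np


def _tps_kernel(d2):
    """Radial basis U = r^2 * log(r^2) on squared distances, with U(0) = 0."""
    safe = np.where(d2 > 0, d2, 1.0)  # avoid log(0); log(1) = 0
    return d2 * np.log(safe)


def fit_tps(ctrl, target):
    """Solve for TPS parameters so that the spline maps ctrl[i] to target[i]."""
    n = ctrl.shape[0]
    d2 = ((ctrl[:, None, :] - ctrl[None, :, :]) ** 2).sum(-1)
    P = np.hstack([np.ones((n, 1)), ctrl])  # affine part: [1, x, y]
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = _tps_kernel(d2)
    A[:n, n:] = P
    A[n:, :n] = P.T
    b = np.zeros((n + 3, 2))
    b[:n] = target
    return np.linalg.solve(A, b)  # rows: n kernel weights, then 3 affine rows


def apply_tps(params, ctrl, pts):
    """Evaluate the fitted spline at pts (M, 2)."""
    d2 = ((pts[:, None, :] - ctrl[None, :, :]) ** 2).sum(-1)
    P = np.hstack([np.ones((pts.shape[0], 1)), pts])
    return _tps_kernel(d2) @ params[:-3] + P @ params[-3:]


def random_tps_pair(primitive, image, grid=3, scale=0.1, rng=None):
    """Warp primitive and image with the SAME random TPS (assumed settings)."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    # Control points on a regular grid in normalized [0, 1] coordinates.
    g = np.linspace(0.0, 1.0, grid)
    ctrl = np.stack(np.meshgrid(g, g, indexing="xy"), -1).reshape(-1, 2)
    shifted = ctrl + rng.uniform(-scale, scale, ctrl.shape)
    # Backward mapping: each output pixel looks up a source location.
    params = fit_tps(ctrl, shifted)
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel() / (w - 1), ys.ravel() / (h - 1)], -1)
    src = apply_tps(params, ctrl, pts)
    # Nearest-neighbor sampling, clamped to the image border.
    sx = np.clip(np.rint(src[:, 0] * (w - 1)), 0, w - 1).astype(int)
    sy = np.clip(np.rint(src[:, 1] * (h - 1)), 0, h - 1).astype(int)

    def sample(a):
        return a[sy, sx].reshape(a.shape)

    return sample(primitive), sample(image)
```

Drawing a fresh warp at every training step would yield a different aligned pair each time from the one available image. Applying the same sampling indices to both arrays keeps the primitive and the image in pixel-wise correspondence, which is what a primitive-to-image mapping needs.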