Abstract
Cut-and-paste methods take an object from one image and insert it into another. Doing so often results in unrealistic-looking images because the inserted object's shading is inconsistent with the target scene's shading. Existing reshading methods require a geometric and physical model of the inserted object, which is then rendered using environment parameters. Accurately constructing such a model from only a single image is beyond the current understanding of computer vision. We describe an alternative procedure, cut-and-paste neural rendering, that renders the inserted fragment's shading field so that it is consistent with the target scene. We use a Deep Image Prior (DIP) as a neural renderer trained to render an image with consistent image decomposition inferences. The rendering produced by the DIP should have an albedo consistent with the composite albedo; a shading field that, outside the inserted fragment, is the same as the target scene's shading field; and a shading field that is consistent with the composite surface normals. The result is a simple procedure that produces convincing and realistic shading. Moreover, our procedure does not require rendered images or image decompositions of real images in training, nor does it require labeled annotations. In fact, our only use of simulated ground truth is our use of a pre-trained normal estimator. Qualitative results are strong, supported by a user study comparing against the state-of-the-art image harmonization baseline.
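The first two consistency requirements above can be illustrated with a minimal sketch. It assumes the albedo and shading fields have already been inferred by some image decomposition step, and it uses a mean squared error as the penalty. The abstract does not specify the form of the penalty, so that choice is an assumption, and the function names `albedo_consistency` and `shading_consistency_outside` are hypothetical. The third requirement (agreement between the shading field and the composite surface normals) is omitted because it depends on details not given here.

```python
import numpy as np

def albedo_consistency(rendered_albedo, composite_albedo):
    """Mean squared difference between the rendering's albedo and the
    composite's albedo (MSE is an assumed penalty, not the paper's)."""
    return float(np.mean((rendered_albedo - composite_albedo) ** 2))

def shading_consistency_outside(rendered_shading, target_shading, fragment_mask):
    """Mean squared shading difference over pixels outside the inserted
    fragment. fragment_mask is 1 inside the fragment and 0 elsewhere."""
    outside = 1.0 - fragment_mask
    count = outside.sum()
    if count == 0:
        # The fragment covers the whole image, so nothing is constrained.
        return 0.0
    squared_error = (rendered_shading - target_shading) ** 2
    return float(np.sum(outside * squared_error) / count)
```

In this sketch the shading term ignores pixels inside the fragment entirely, which leaves the renderer free to reshade the inserted object while the rest of the scene is held to the target's shading.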