We present a novel approach for animating a source image by a driving video, both depicting the same type of object. We do not assume the existence of pose models, and our method is able to animate arbitrary objects without knowledge of the object's structure. Furthermore, both the driving video and the source image are seen only at test time. Our method is based on a shared mask generator, which separates the foreground object from its background and captures the object's general pose and shape. A mask-refinement module then replaces, in the mask extracted from the driver image, the identity of the driver with the identity of the source. Conditioned on the source image, the transformed mask is then decoded by a multi-scale generator that renders a realistic image, in which the content of the source frame is animated by the pose in the driving video. Due to the lack of fully supervised data, we train on the task of reconstructing frames from the same video the source image is taken from. To control the source of the identity of the output frame, we employ perturbations during training that remove the unwanted identity information. Our method is shown to greatly outperform state-of-the-art methods on multiple benchmarks. Our code and samples are available at https://github.com/itsyoavshalev/Image-Animation-with-Perturbed-Masks.
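The self-supervised training setup described above can be illustrated with a minimal sketch. It is not the released implementation: the function names (`mask_generator`, `perturb_mask`, `training_pair`), the thresholding stand-in for the learned mask generator, and the horizontal stretch used as a perturbation are all assumptions made for illustration, since the abstract does not specify the form of the perturbations. The learned refinement module and multi-scale generator are omitted.

```python
import random


def mask_generator(frame, threshold=0.5):
    """Hypothetical stand-in for the learned mask generator:
    threshold a grayscale frame (2D list of floats) into a binary mask."""
    return [[1 if px > threshold else 0 for px in row] for row in frame]


def perturb_mask(mask, scale):
    """Hypothetical perturbation (an assumption, not the paper's definition):
    stretch the mask horizontally about its center with nearest-neighbor
    sampling, filling out-of-range columns with background (0)."""
    width = len(mask[0])
    cx = (width - 1) / 2
    out = []
    for row in mask:
        new_row = []
        for x in range(width):
            src = round((x - cx) / scale + cx)
            new_row.append(row[src] if 0 <= src < width else 0)
        out.append(new_row)
    return out


def training_pair(video, rng):
    """Pick two distinct frames of the same video: the first acts as the
    source, the second as the driver and as the reconstruction target."""
    i, j = rng.sample(range(len(video)), 2)
    return video[i], video[j]


def generator_inputs(source, driver, rng):
    """Build what a generator would be conditioned on during training:
    the source frame plus a perturbed mask of the driver frame."""
    scale = rng.uniform(0.8, 1.2)  # assumed perturbation range
    return source, perturb_mask(mask_generator(driver), scale)


if __name__ == "__main__":
    rng = random.Random(0)
    video = [
        [[0.0, 0.9, 0.9, 0.0], [0.0, 0.9, 0.9, 0.0]],
        [[0.9, 0.9, 0.0, 0.0], [0.9, 0.9, 0.0, 0.0]],
        [[0.0, 0.0, 0.9, 0.9], [0.0, 0.0, 0.9, 0.9]],
    ]
    source, driver = training_pair(video, rng)
    conditioned_source, perturbed = generator_inputs(source, driver, rng)
    print(perturbed)
```

In this sketch, the driver frame doubles as the reconstruction target, so a model trained this way could copy identity cues straight from the driver's mask. Distorting the mask is one plausible way to push the model to take identity from the source image instead.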