Contrastive Learning for Unpaired Image-to-Image Translation

Abstract

In image-to-image translation, each patch in the output should reflect the content of the corresponding patch in the input, independent of domain. We propose a straightforward method for doing so: maximizing mutual information between the two, using a framework based on contrastive learning. The method encourages two elements (corresponding patches) to map to a similar point in a learned feature space, relative to other elements (other patches) in the dataset, referred to as negatives. We explore several critical design choices for making contrastive learning effective in the image synthesis setting. Notably, we use a multilayer, patch-based approach, rather than operate on entire images. Furthermore, we draw negatives from within the input image itself, rather than from the rest of the dataset. We demonstrate that our framework enables one-sided translation in the unpaired image-to-image translation setting, while improving quality and reducing training time. In addition, our method can even be extended to the training setting where each "domain" is only a single image.
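
The patch-level contrastive objective described above can be illustrated with a minimal sketch. It assumes an InfoNCE-style cross-entropy loss over cosine similarities, which is one common way to maximize a lower bound on mutual information; the abstract does not spell out the exact loss. The function names `l2_normalize` and `patch_nce_loss`, the temperature of 0.07, and the toy feature vectors are all illustrative assumptions, not taken from the paper. Each output patch feature is pulled toward the input patch feature at the same location (the positive) and pushed away from the other patches of the same input image (the negatives).

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def patch_nce_loss(output_feats, input_feats, temperature=0.07):
    """InfoNCE-style loss over patches of a single image (illustrative sketch).

    output_feats[i] and input_feats[i] are feature vectors for the patch at
    location i in the translated output and in the input, respectively.
    For query i, the positive is input_feats[i]; the negatives are the other
    input patches from the same image. The temperature value is an assumption.
    """
    queries = [l2_normalize(v) for v in output_feats]
    keys = [l2_normalize(v) for v in input_feats]
    total = 0.0
    for i, query in enumerate(queries):
        # Cosine similarity of this output patch to every input patch.
        logits = [sum(a * b for a, b in zip(query, key)) / temperature
                  for key in keys]
        # Numerically stable log-sum-exp for the softmax normalizer.
        peak = max(logits)
        log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the positive (same location) as the target class.
        total += log_z - logits[i]
    return total / len(queries)


# Toy example: three patches with hypothetical 3-dimensional features.
input_feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
aligned = patch_nce_loss(input_feats, input_feats)
shuffled = patch_nce_loss(input_feats[1:] + input_feats[:1], input_feats)
print(aligned < shuffled)
```

In this toy case the loss is near zero when each output patch matches its own input patch and large when the correspondence is shuffled. A multilayer variant, as the abstract describes, would presumably compute such a term on features from several encoder layers and sum the results; that detail is likewise an assumption here.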
