UnZipLoRA: Separating Content and Style from a Single Image

Abstract

This paper introduces UnZipLoRA, a method for decomposing an image into its constituent subject and style, represented as two distinct LoRAs (Low-Rank Adaptations). Unlike existing personalization techniques that focus on either subject or style in isolation, or require separate training sets for each, UnZipLoRA disentangles these elements from a single image by training both the LoRAs simultaneously. UnZipLoRA ensures that the resulting LoRAs are compatible, i.e., they can be seamlessly combined using direct addition. UnZipLoRA enables independent manipulation and recontextualization of subject and style, including generating variations of each, applying the extracted style to new subjects, and recombining them to reconstruct the original image or create novel variations. To address the challenge of subject and style entanglement, UnZipLoRA employs a novel prompt separation technique, as well as column and block separation strategies to accurately preserve the characteristics of subject and style, and ensure compatibility between the learned LoRAs. Evaluation with human studies and quantitative metrics demonstrates UnZipLoRA's effectiveness compared to other state-of-the-art methods, including DreamBooth-LoRA, Inspiration Tree, and B-LoRA.
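
The statement that compatible LoRAs can be combined by direct addition can be illustrated with a minimal sketch. It assumes the standard LoRA formulation, in which each adapter contributes a low-rank update B·A to a frozen weight matrix. The matrices, shapes, values, and helper names below are hypothetical and not taken from the paper, and the sketch does not cover UnZipLoRA's training procedure or its prompt, column, and block separation strategies.

```python
# Illustrative sketch only: all names, shapes, and values are assumptions,
# not the paper's implementation.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two same-shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def lora_delta(B, A):
    """Low-rank weight update delta_W = B @ A, with B (d_out x r) and A (r x d_in)."""
    return matmul(B, A)

# Frozen base weight W0 (2 x 2) and two rank-1 LoRAs with hypothetical values.
W0 = [[1.0, 0.0], [0.0, 1.0]]
B_subject, A_subject = [[1.0], [0.0]], [[0.5, 0.0]]
B_style, A_style = [[0.0], [1.0]], [[0.0, 0.25]]

delta_subject = lora_delta(B_subject, A_subject)
delta_style = lora_delta(B_style, A_style)

# Direct addition: merged weight = base + subject update + style update.
W_merged = add(add(W0, delta_subject), delta_style)

# Using only one update corresponds to applying the subject or the style alone.
W_subject_only = add(W0, delta_subject)
```

In this toy case the two updates touch different entries of the weight matrix, so adding them leaves each contribution intact. With arbitrary LoRAs the updates can overlap and interfere, which is the compatibility problem the abstract refers to.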
