StyleDrop: Text-to-Image Generation in Any Style

  • 2023-06-01 18:59:51
  • Kihyuk Sohn, Nataniel Ruiz, Kimin Lee, Daniel Castro Chin, Irina Blok, Huiwen Chang, Jarred Barber, Lu Jiang, Glenn Entis, Yuanzhen Li, Yuan Hao, Irfan Essa, Michael Rubinstein, Dilip Krishnan


Pre-trained large text-to-image models synthesize impressive images with an appropriate use of text prompts. However, ambiguities inherent in natural language and out-of-distribution effects make it hard to synthesize image styles that leverage a specific design pattern, texture, or material. In this paper, we introduce StyleDrop, a method that enables the synthesis of images that faithfully follow a specific style using a text-to-image model. The proposed method is extremely versatile and captures nuances and details of a user-provided style, such as color schemes, shading, design patterns, and local and global effects. It efficiently learns a new style by fine-tuning very few trainable parameters (less than $1\%$ of total model parameters) and improving the quality via iterative training with either human or automated feedback. Better yet, StyleDrop is able to deliver impressive results even when the user supplies only a single image that specifies the desired style. An extensive study shows that, for the task of style tuning text-to-image models, StyleDrop implemented on Muse convincingly outperforms other methods, including DreamBooth and textual inversion on Imagen or Stable Diffusion. More results are available at our project website:


