The Hidden Language of Diffusion Models

Abstract

Text-to-image diffusion models have demonstrated an unparalleled ability to generate high-quality, diverse images from a textual concept (e.g., "a doctor", "love"). However, the internal process of mapping text to a rich visual representation remains an enigma. In this work, we tackle the challenge of understanding concept representations in text-to-image models by decomposing an input text prompt into a small set of interpretable elements. This is achieved by learning a pseudo-token that is a sparse weighted combination of tokens from the model's vocabulary, with the objective of reconstructing the images generated for the given concept. Applied over the state-of-the-art Stable Diffusion model, this decomposition reveals non-trivial and surprising structures in the representations of concepts. For example, we find that some concepts such as "a president" or "a composer" are dominated by specific instances (e.g., "Obama", "Biden") and their interpolations. Other concepts, such as "happiness", combine associated terms that can be concrete ("family", "laughter") or abstract ("friendship", "emotion"). In addition to peering into the inner workings of Stable Diffusion, our method also enables applications such as single-image decomposition to tokens, bias detection and mitigation, and semantic image manipulation. Our code will be available at: https://hila-chefer.github.io/Conceptor/
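
To make the idea of a pseudo-token concrete, the following is a minimal sketch of only the combination step: a vector built as a sparse weighted sum of vocabulary embeddings. It is not the released implementation. The toy vocabulary, the random embeddings, the hard-coded coefficients, and the function name `sparse_pseudo_token` are all illustrative assumptions; in the method described above, the coefficients are learned by reconstructing images generated for the concept, which this sketch does not attempt.

```python
import numpy as np

def sparse_pseudo_token(embeddings, coeffs, k):
    """Return a weighted sum of the k embeddings with the largest coefficients.

    Hypothetical helper for illustration only: coefficients are given here,
    whereas the abstract describes learning them from generated images.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    coeffs = np.asarray(coeffs, dtype=float)
    # Indices of the k largest coefficients, in descending order.
    top = np.argsort(coeffs)[::-1][:k]
    # Zero out every coefficient outside the top k to enforce sparsity.
    sparse = np.zeros_like(coeffs)
    sparse[top] = coeffs[top]
    # The pseudo-token is the coefficient-weighted sum of embedding rows.
    pseudo = sparse @ embeddings
    return pseudo, top

# Toy stand-ins (assumed): a 5-word vocabulary with random 8-dim embeddings.
vocab = ["obama", "biden", "suit", "flag", "tree"]
rng = np.random.default_rng(0)
E = rng.normal(size=(len(vocab), 8))
alpha = np.array([0.9, 0.7, 0.3, 0.05, 0.01])

pseudo, top = sparse_pseudo_token(E, alpha, k=3)
print([vocab[i] for i in top])  # ['obama', 'biden', 'suit']
```

The retained tokens and their weights are what make the decomposition readable: each surviving vocabulary entry is an interpretable element of the concept.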
