JEDI: The Force of Jensen-Shannon Divergence in Disentangling Diffusion Models

  • 2025-05-25 15:32:24
  • Eric Tillmann Bill, Enis Simsar, Thomas Hofmann

Abstract

We introduce JEDI, a test-time adaptation method that enhances subject separation and compositional alignment in diffusion models without requiring retraining or external supervision. JEDI operates by minimizing semantic entanglement in attention maps using a novel Jensen-Shannon divergence based objective. To improve efficiency, we leverage adversarial optimization, reducing the number of updating steps required. JEDI is model-agnostic and applicable to architectures such as Stable Diffusion 1.5 and 3.5, consistently improving prompt alignment and disentanglement in complex scenes. Additionally, JEDI provides a lightweight, CLIP-free disentanglement score derived from internal attention distributions, offering a principled benchmark for compositional alignment under test-time conditions. We will publicly release the implementation of our method.
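
To make the central quantity concrete, the following is a minimal sketch of the standard Jensen-Shannon divergence between two attention maps, each flattened and normalized into a probability distribution. It is not the authors' objective or implementation, which the abstract does not specify; the function names and the toy maps `cat_map` and `dog_map` are hypothetical and chosen only for illustration. With base-2 logarithms the value lies in [0, 1], and a higher value means the two maps overlap less.

```python
import math


def normalize(weights):
    """Scale non-negative attention weights so they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("attention weights must have a positive sum")
    return [w / total for w in weights]


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two flattened attention maps."""
    p, q = normalize(p), normalize(q)
    # The mixture m is positive wherever p or q is, so the KL terms are finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical flattened attention maps for two subject tokens (assumed data).
cat_map = [0.7, 0.2, 0.1, 0.0]
dog_map = [0.0, 0.1, 0.2, 0.7]
print(js_divergence(cat_map, dog_map))
```

Two maps that attend to disjoint regions reach the maximum of 1, while identical maps give 0, which is why a divergence of this kind can serve as a signal of how separated two subjects are in attention space.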

 
