DiffLocks: Generating 3D Hair from a Single Image using Diffusion Models

Abstract

We address the task of generating 3D hair geometry from a single image, which is challenging due to the diversity of hairstyles and the lack of paired image-to-3D hair data. Previous methods are primarily trained on synthetic data and cope with the limited amount of such data by using low-dimensional intermediate representations, such as guide strands and scalp-level embeddings, that require post-processing to decode, upsample, and add realism. These approaches fail to reconstruct detailed hair, struggle with curly hair, or are limited to handling only a few hairstyles. To overcome these limitations, we propose DiffLocks, a novel framework that enables detailed reconstruction of a wide variety of hairstyles directly from a single image. First, we address the lack of 3D hair data by automating the creation of the largest synthetic hair dataset to date, containing 40K hairstyles. Second, we leverage the synthetic hair dataset to learn an image-conditioned diffusion-transformer model that generates accurate 3D strands from a single frontal image. By using a pretrained image backbone, our method generalizes to in-the-wild images despite being trained only on synthetic data. Our diffusion model predicts a scalp texture map in which any point in the map contains the latent code for an individual hair strand. These codes are directly decoded to 3D strands without post-processing techniques. Representing individual strands, instead of guide strands, enables the transformer to model the detailed spatial structure of complex hairstyles. With this, DiffLocks can recover highly curled hair, like afro hairstyles, from a single image for the first time. Data and code are available at https://radualexandru.github.io/difflocks/
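
To make the scalp texture map representation concrete, the following is a minimal sketch of how a latent code could be read at arbitrary scalp positions and turned into a strand. It is not the authors' implementation: the function names `sample_latents` and `decode_strands`, the map resolution, the latent size, and the linear decoder are all assumptions chosen for illustration, and random arrays stand in for the diffusion output and for trained decoder weights.

```python
import numpy as np

def sample_latents(scalp_map, uv):
    """Bilinearly sample per-strand latent codes from a scalp texture map.

    scalp_map: (H, W, C) array holding one latent code per texel.
    uv: (N, 2) array of scalp coordinates in [0, 1], ordered (u, v).
    Returns an (N, C) array of latent codes.
    """
    h, w, _ = scalp_map.shape
    # Map UV coordinates to continuous pixel coordinates.
    x = uv[:, 0] * (w - 1)
    y = uv[:, 1] * (h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    # Clamp the far neighbours so samples on the border stay in range.
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    fx = (x - x0)[:, None]
    fy = (y - y0)[:, None]
    top = (1 - fx) * scalp_map[y0, x0] + fx * scalp_map[y0, x1]
    bottom = (1 - fx) * scalp_map[y1, x0] + fx * scalp_map[y1, x1]
    return (1 - fy) * top + fy * bottom

def decode_strands(latents, decoder_weights, points_per_strand):
    """Hypothetical linear decoder: latent code -> polyline of 3D points.

    This is a stand-in for a learned strand decoder, used only to show shapes.
    """
    flat = latents @ decoder_weights  # (N, points_per_strand * 3)
    return flat.reshape(-1, points_per_strand, 3)

rng = np.random.default_rng(0)
scalp_map = rng.normal(size=(32, 32, 16))  # stand-in for the predicted map
weights = rng.normal(size=(16, 8 * 3))     # stand-in for decoder weights
uv = rng.uniform(size=(100, 2))            # 100 strand root positions
strands = decode_strands(sample_latents(scalp_map, uv), weights, 8)
```

Because each strand is decoded from its own code, the number of strands in this sketch is set by how many scalp positions are sampled, with no interpolation between guide strands.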
