DreamFusion: Text-to-3D using 2D Diffusion

  • 2022-09-29 18:50:40
  • Ben Poole, Ajay Jain, Jonathan T. Barron, Ben Mildenhall

Abstract

Recent breakthroughs in text-to-image synthesis have been driven by diffusion models trained on billions of image-text pairs. Adapting this approach to 3D synthesis would require large-scale datasets of labeled 3D data and efficient architectures for denoising 3D data, neither of which currently exist. In this work, we circumvent these limitations by using a pretrained 2D text-to-image diffusion model to perform text-to-3D synthesis. We introduce a loss based on probability density distillation that enables the use of a 2D diffusion model as a prior for optimization of a parametric image generator. Using this loss in a DeepDream-like procedure, we optimize a randomly-initialized 3D model (a Neural Radiance Field, or NeRF) via gradient descent such that its 2D renderings from random angles achieve a low loss. The resulting 3D model of the given text can be viewed from any angle, relit by arbitrary illumination, or composited into any 3D environment. Our approach requires no 3D training data and no modifications to the image diffusion model, demonstrating the effectiveness of pretrained image diffusion models as priors.
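
The abstract does not spell out the loss, so the following is only a toy sketch of the general idea it describes: a frozen noise predictor acts as a prior, and the parameters of an image generator are updated by gradient descent so that its outputs look likely under that prior. Everything here is an assumption made for illustration. The "diffusion model" is a hypothetical analytic noise predictor for a Gaussian target N(mu, s^2), the "renderer" is the identity function rather than a NeRF, the noise schedule is an arbitrary variance-preserving choice, and the update uses "predicted noise minus injected noise" as the gradient with respect to the rendered output. The names `noise_schedule`, `toy_eps_hat`, `render`, and `distill` are invented for this sketch.

```python
import math
import random


def noise_schedule(t):
    """Assumed variance-preserving schedule: alpha^2 + sigma^2 = 1."""
    return math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)


def toy_eps_hat(z_t, t, mu, s):
    """Exact noise predictor E[eps | z_t] when clean data follow N(mu, s^2).

    Hypothetical stand-in for a pretrained, frozen diffusion model.
    """
    alpha, sigma = noise_schedule(t)
    denom = alpha ** 2 * s ** 2 + sigma ** 2
    return [sigma * (z - alpha * m) / denom for z, m in zip(z_t, mu)]


def render(theta):
    """Placeholder renderer: identity, so d(render)/d(theta) is the identity."""
    return list(theta)


def distill(mu, s=0.5, steps=3000, lr=0.01, batch=16, seed=0):
    """Optimize randomly initialized parameters against the frozen prior."""
    rng = random.Random(seed)
    theta = [rng.gauss(0.0, 1.0) for _ in mu]  # random initialization
    for _step in range(steps):
        x = render(theta)
        grad = [0.0] * len(theta)
        for _sample in range(batch):
            t = rng.uniform(0.02, 0.98)            # random noise level
            alpha, sigma = noise_schedule(t)
            eps = [rng.gauss(0.0, 1.0) for _ in x]  # injected noise
            z_t = [alpha * xi + sigma * e for xi, e in zip(x, eps)]
            eps_hat = toy_eps_hat(z_t, t, mu, s)
            # Gradient w.r.t. the rendering: predicted minus injected noise.
            # The noise predictor itself is never differentiated or updated.
            for i in range(len(grad)):
                grad[i] += (eps_hat[i] - eps[i]) / batch
        # Chain rule through the identity renderer, then a descent step.
        theta = [th - lr * g for th, g in zip(theta, grad)]
    return theta


if __name__ == "__main__":
    print(distill([1.0, -2.0]))
```

In this toy setting the expected update is proportional to `theta - mu`, so the parameters drift toward the mode of the prior. A real system would replace `render` with a differentiable NeRF renderer viewed from random camera angles and `toy_eps_hat` with a text-conditioned image diffusion model.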

 
