Scaling Properties of Diffusion Models for Perceptual Tasks

  • 2024-11-13 18:59:44
  • Rahul Ravishankar, Zeeshan Patel, Jathushan Rajasegaran, Jitendra Malik

Abstract

In this paper, we argue that iterative computation with diffusion models offers a powerful paradigm not only for generation but also for visual perception tasks. We unify tasks such as depth estimation, optical flow, and amodal segmentation under the framework of image-to-image translation, and show how diffusion models benefit from scaling training and test-time compute for these perceptual tasks. Through a careful analysis of these scaling properties, we formulate compute-optimal training and inference recipes to scale diffusion models for visual perception tasks. Our models achieve performance competitive with state-of-the-art methods using significantly less data and compute. Our code and models are available at https://scaling-diffusion-perception.github.io.

 
