Diffusion-HPC: Generating Synthetic Images with Realistic Humans

Abstract

Recent text-to-image generative models have exhibited remarkable abilities in generating high-fidelity and photo-realistic images. However, despite the visually impressive results, these models often struggle to preserve plausible human structure in their generations. For this reason, while generative models have shown promising results in aiding downstream image recognition tasks by generating large volumes of synthetic data, they remain infeasible for improving downstream human pose perception and understanding. In this work, we propose Diffusion model with Human Pose Correction (Diffusion-HPC), a text-conditioned method that generates photo-realistic images with plausibly posed humans by injecting prior knowledge about human body structure. We show that Diffusion-HPC effectively improves the realism of human generations. Furthermore, as the generations are accompanied by 3D meshes that serve as ground truths, Diffusion-HPC's generated image-mesh pairs are well-suited for the downstream human mesh recovery task, where a shortage of 3D training data has long been an issue.
