Abstract
In this paper, we introduce a Key-point-guided Diffusion probabilistic Model (KDM) that gains precise control over images by manipulating the object's key points. We propose a two-stage generative model that incorporates an optical flow map as an intermediate output. This establishes a dense, pixel-wise understanding of the semantic relation between the image and the sparse key points, leading to more realistic image generation. Additionally, the integration of optical flow helps regulate the inter-frame variance of sequential images, enabling authentic sequential image generation. The KDM is evaluated on diverse key-point-conditioned image synthesis tasks, including facial image generation, human pose synthesis, and echocardiography video prediction, demonstrating that the KDM provides consistency-enhanced and photo-realistic images compared with state-of-the-art models.