Language-Guided Face Animation by Recurrent StyleGAN-based Generator

Abstract

Recent works on language-guided image manipulation have shown the great power of language in providing rich semantics, especially for face images. However, the other natural information in language, motion, is less explored. In this paper, we leverage the motion information and study a novel task, language-guided face animation, that aims to animate a static face image with the help of language. To better utilize both semantics and motions from language, we propose a simple yet effective framework. Specifically, we propose a recurrent motion generator to extract a series of semantic and motion information from the language and feed it along with visual information to a pre-trained StyleGAN to generate high-quality frames. To optimize the proposed framework, three carefully designed loss functions are proposed, including a regularization loss to keep the face identity, a path length regularization loss to ensure motion smoothness, and a contrastive loss to enable video synthesis with various language guidance in one single model. Extensive experiments with both qualitative and quantitative evaluations on diverse domains (e.g., human face, anime face, and dog face) demonstrate the superiority of our model in generating high-quality and realistic videos from one still image with the guidance of language. Code will be available at https://github.com/TiankaiHang/language-guided-animation.git.
