CLIP-Actor: Text-Driven Recommendation and Stylization for Animating Human Meshes

  • 2022-06-09 10:50:39
  • Kim Youwang, Kim Ji-Yeon, Tae-Hyun Oh
We propose CLIP-Actor, a text-driven motion recommendation and neural mesh stylization system for human mesh animation. CLIP-Actor animates a 3D human mesh to conform to a text prompt by recommending a motion sequence and learning mesh style attributes. Prior work fails to generate plausible results when the artist-designed mesh content does not conform to the text from the beginning. Instead, we build a text-driven human motion recommendation system by leveraging a large-scale human motion dataset with language labels. Given a natural language prompt, CLIP-Actor first suggests a human motion that conforms to the prompt in a coarse-to-fine manner. Then, we propose a synthesize-through-optimization method that detailizes and texturizes a recommended mesh sequence in a disentangled way from the pose of each frame. It allows the style attribute to conform to the prompt in a temporally-consistent and pose-agnostic manner. The decoupled neural optimization also enables spatio-temporal view augmentation from multi-frame human motion. We further propose the mask-weighted embedding attention, which stabilizes the optimization process by rejecting distracting renders containing scarce foreground pixels. We demonstrate that CLIP-Actor produces plausible and human-recognizable style 3D human mesh in motion with detailed geometry and texture from a natural language prompt.
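The idea of down-weighting renders with scarce foreground pixels can be illustrated with a minimal sketch. The abstract does not give the actual formulation, so everything below is an assumption made for illustration: the function name `mask_weighted_embedding`, the use of the foreground-pixel ratio as the weight, and the `threshold` cutoff are all hypothetical, and plain NumPy arrays stand in for the image embeddings and rendered masks.

```python
import numpy as np

def mask_weighted_embedding(embeddings, masks, threshold=0.05):
    """Aggregate per-render embeddings, weighted by foreground coverage.

    embeddings: (N, D) array, one embedding per rendered view (assumed input).
    masks:      (N, H, W) array of binary foreground masks for those renders.
    threshold:  hypothetical cutoff; renders below it get zero weight.
    """
    embeddings = np.asarray(embeddings, dtype=np.float64)
    masks = np.asarray(masks, dtype=np.float64)

    # Fraction of foreground pixels in each render.
    fg_ratio = masks.reshape(masks.shape[0], -1).mean(axis=1)

    # Reject renders with too little foreground; weight the rest by coverage.
    weights = np.where(fg_ratio >= threshold, fg_ratio, 0.0)
    total = weights.sum()
    if total == 0.0:
        raise ValueError("no render has enough foreground pixels")
    weights = weights / total

    # Weighted average of the embeddings, plus the weights for inspection.
    return weights @ embeddings, weights
```

In this sketch, a render that shows almost no part of the mesh contributes nothing to the aggregated embedding, so it cannot pull the optimization toward a view that carries little information about the subject.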


