Motion Representations for Articulated Animation

Abstract

We propose novel motion representations for animating articulated objects consisting of distinct parts. In a completely unsupervised manner, our method identifies object parts, tracks them in a driving video, and infers their motions by considering their principal axes. In contrast to the previous keypoint-based works, our method extracts meaningful and consistent regions, describing locations, shape, and pose. The regions correspond to semantically relevant and distinct object parts that are more easily detected in frames of the driving video. To force decoupling of foreground from background, we model non-object related global motion with an additional affine transformation. To facilitate animation and prevent the leakage of the shape of the driving object, we disentangle shape and pose of objects in the region space. Our model can animate a variety of objects, surpassing previous methods by a large margin on existing benchmarks. We present a challenging new benchmark with high-resolution videos and show that the improvement is particularly pronounced when articulated objects are considered, reaching 96.6% user preference vs. the state of the art.
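
To make the phrase "considering their principal axes" concrete, the following is a minimal sketch of how a position, orientation, and extent could be read off a single soft region mask. It is an illustration under stated assumptions, not the paper's implementation: the function name `region_principal_axes`, the list-of-lists heatmap format, and the closed-form 2x2 eigen-decomposition are all hypothetical choices made here for a dependency-free example.

```python
import math


def region_principal_axes(heatmap):
    """Summarize a 2-D soft region mask by its weighted mean and principal axes.

    `heatmap` is a list of rows of non-negative weights (hypothetical format).
    Returns ((mean_x, mean_y), angle, (major_var, minor_var)), where `angle`
    is the orientation of the major axis in radians, measured from the x axis.
    """
    total = sum(sum(row) for row in heatmap)
    if total <= 0:
        raise ValueError("heatmap must contain positive weight")

    # Weighted centroid: the region's location.
    mean_x = sum(w * x for row in heatmap for x, w in enumerate(row)) / total
    mean_y = sum(w * y for y, row in enumerate(heatmap) for w in row) / total

    # Weighted covariance: the region's shape and pose.
    cxx = cxy = cyy = 0.0
    for y, row in enumerate(heatmap):
        for x, w in enumerate(row):
            dx, dy = x - mean_x, y - mean_y
            cxx += w * dx * dx
            cxy += w * dx * dy
            cyy += w * dy * dy
    cxx, cxy, cyy = cxx / total, cxy / total, cyy / total

    # Closed-form eigen-decomposition of the symmetric 2x2 covariance matrix.
    angle = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    half_trace = 0.5 * (cxx + cyy)
    spread = math.hypot(0.5 * (cxx - cyy), cxy)
    major_var = half_trace + spread
    minor_var = max(half_trace - spread, 0.0)  # clamp tiny negative round-off
    return (mean_x, mean_y), angle, (major_var, minor_var)


# Usage: a horizontal bar should yield a major axis aligned with the x axis.
bar = [[0.0] * 9 for _ in range(5)]
bar[2] = [1.0] * 9
mean, angle, variances = region_principal_axes(bar)
print(mean, angle, variances)
```

Comparing the mean, angle, and variances of the same region in a source frame and a driving frame gives a translation, rotation, and scaling for that part, which is one plausible reading of how per-region motion could be inferred from principal axes.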
