Abstract
Neural implicit surface representations have emerged as a promising paradigm to capture 3D shapes in a continuous and resolution-independent manner. However, adapting them to articulated shapes is non-trivial. Existing approaches learn a backward warp field that maps deformed to canonical points. This is problematic because the backward warp field is pose-dependent and thus requires large amounts of data to learn. To address this, we introduce SNARF, which combines the advantages of linear blend skinning (LBS) for polygonal meshes with those of neural implicit surfaces by learning a forward deformation field without direct supervision. This deformation field is defined in canonical, pose-independent space, allowing for generalization to unseen poses. Learning the deformation field from posed meshes alone is challenging since the correspondences of deformed points are defined implicitly and may not be unique under changes of topology. We propose a forward skinning model that finds all canonical correspondences of any deformed point using iterative root finding. We derive analytical gradients via implicit differentiation, enabling end-to-end training from 3D meshes with bone transformations. Compared to state-of-the-art neural implicit representations, our approach generalizes better to unseen poses while preserving accuracy. We demonstrate our method in challenging scenarios on (clothed) 3D humans in diverse and unseen poses.
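
To make the idea of finding canonical correspondences by root finding concrete, the following is a minimal 2D sketch, not the implementation described in this work. It assumes two bones, a hand-written sigmoid weight function `skin_weights` in place of a learned skinning network, and Newton's method with a finite-difference Jacobian as a simple stand-in for the iterative root finder. All function names and constants are hypothetical. Following the abstract, the forward map is defined on canonical points, and a deformed point is inverted by solving d(x) = x' from several initializations (here, one per bone), keeping every distinct root that converges.

```python
import numpy as np

def rot2d(theta):
    """Homogeneous 2D rotation about the origin (a stand-in bone transformation)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def skin_weights(x):
    """Hypothetical pose-independent weights: smooth blend of two bones along the x-axis."""
    w1 = 1.0 / (1.0 + np.exp(-4.0 * x[0]))
    return np.array([1.0 - w1, w1])

def forward_lbs(x, bones):
    """Forward skinning d(x) = sum_i w_i(x) B_i x, with x a canonical 2D point."""
    w = skin_weights(x)
    blended = sum(wi * B for wi, B in zip(w, bones))
    return (blended @ np.append(x, 1.0))[:2]

def numeric_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f at x (assumption: used here for brevity)."""
    J = np.zeros((2, 2))
    for k in range(2):
        dx = np.zeros(2)
        dx[k] = eps
        J[:, k] = (f(x + dx) - f(x - dx)) / (2.0 * eps)
    return J

def find_canonical(x_def, bones, iters=50, tol=1e-8):
    """Return all distinct canonical points x with forward_lbs(x) == x_def (up to tol)."""
    f = lambda p: forward_lbs(p, bones)
    roots = []
    for B in bones:
        # Initialize by undoing one rigid bone transformation at a time.
        x = (np.linalg.inv(B) @ np.append(x_def, 1.0))[:2]
        for _ in range(iters):
            residual = f(x) - x_def
            if np.linalg.norm(residual) < tol:
                break
            x = x - np.linalg.solve(numeric_jacobian(f, x), residual)
        converged = np.linalg.norm(f(x) - x_def) < tol
        is_new = all(np.linalg.norm(x - known) > 1e-5 for known in roots)
        if converged and is_new:
            roots.append(x)
    return roots

# Usage: deform a known canonical point, then recover it from the deformed point alone.
bones = [rot2d(0.0), rot2d(0.5)]
x_canonical = np.array([0.3, 0.2])
x_deformed = forward_lbs(x_canonical, bones)
roots = find_canonical(x_deformed, bones)
```

Because the weights depend only on the canonical point, the same `skin_weights` can be reused for any set of bone transformations; only the root-finding step changes with pose. The sketch does not cover the implicit-differentiation step used for training.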