Abstract
We propose Neural 3D Articulation Prior (NAP), the first 3D deep generative model to synthesize 3D articulated object models. Despite the extensive research on generating 3D objects, compositions, or scenes, there remains a lack of focus on capturing the distribution of articulated objects, a common object category for human and robot interaction. To generate articulated objects, we first design a novel articulation tree/graph parameterization and then apply a diffusion-denoising probabilistic model over this representation, where articulated objects can be generated via denoising from random complete graphs. To capture both the geometry and the motion structure, whose distributions affect each other, we design a graph-attention denoising network for learning the reverse diffusion process. We propose a novel distance that adapts widely used 3D generation metrics to our novel task to evaluate generation quality, and experiments demonstrate our high performance in articulated object generation. We also demonstrate several conditioned generation applications, including Part2Motion, PartNet-Imagination, Motion2Part, and GAPart2Object.
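
As a rough illustration of diffusion over a complete-graph representation, the sketch below applies the standard DDPM forward noising step, q(x_t | x_0) = N(sqrt(ᾱ_t) x_0, (1 − ᾱ_t) I), independently to every node and edge attribute vector. It is not the NAP implementation: the linear beta schedule, the attribute dimensions, and the helper names (`make_complete_graph`, `alpha_bar`, `noise_graph`) are all assumptions made for this example, with nodes standing in for parts and edges for joints.

```python
import math
import random


def make_complete_graph(num_nodes, node_dim, edge_dim):
    """Build a complete graph with zero-valued attributes (hypothetical layout).

    Nodes stand in for parts and edges for joints; the dimensions are
    placeholders, not the parameterization used in the paper.
    """
    nodes = [[0.0] * node_dim for _ in range(num_nodes)]
    edges = {
        (i, j): [0.0] * edge_dim
        for i in range(num_nodes)
        for j in range(i + 1, num_nodes)
    }
    return nodes, edges


def alpha_bar(t, num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_s) for s = 1..t, linear beta schedule.

    The schedule is an assumption; requires num_steps >= 2.
    """
    prod = 1.0
    for s in range(1, t + 1):
        beta = beta_start + (beta_end - beta_start) * (s - 1) / (num_steps - 1)
        prod *= 1.0 - beta
    return prod


def noise_vector(x0, a_bar, rng):
    """Sample x_t ~ N(sqrt(a_bar) * x0, (1 - a_bar) * I)."""
    scale, sigma = math.sqrt(a_bar), math.sqrt(1.0 - a_bar)
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


def noise_graph(nodes, edges, t, num_steps, rng):
    """Apply the forward noising step to all node and edge attributes."""
    a_bar = alpha_bar(t, num_steps)
    noisy_nodes = [noise_vector(v, a_bar, rng) for v in nodes]
    noisy_edges = {pair: noise_vector(e, a_bar, rng) for pair, e in edges.items()}
    return noisy_nodes, noisy_edges


rng = random.Random(0)
nodes, edges = make_complete_graph(num_nodes=4, node_dim=3, edge_dim=2)
noisy_nodes, noisy_edges = noise_graph(nodes, edges, t=500, num_steps=1000, rng=rng)
```

At the final step ᾱ_t is close to zero, so the noised graph is nearly pure Gaussian noise on a complete graph; under these assumptions, that is the kind of random complete graph from which generation by denoising would start.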