CubeNet: Equivariance to 3D Rotation and Translation

Abstract

3D Convolutional Neural Networks are sensitive to transformations applied to their input. This is a problem because a voxelized version of a 3D object, and its rotated clone, will look unrelated to each other after passing through to the last layer of a network. Instead, an idealized model would preserve a meaningful representation of the voxelized object, while explaining the pose-difference between the two inputs. An equivariant representation vector has two components: the invariant identity part, and a discernable encoding of the transformation. Models that can't explain pose-differences risk "diluting" the representation, in pursuit of optimizing a classification or regression loss function. We introduce a Group Convolutional Neural Network with linear equivariance to translations and right angle rotations in three dimensions. We call this network CubeNet, reflecting its cube-like symmetry. By construction, this network helps preserve a 3D shape's global and local signature, as it is transformed through successive layers. We apply this network to a variety of 3D inference problems, achieving state-of-the-art on the ModelNet10 classification challenge, and comparable performance on the ISBI 2012 Connectome Segmentation Benchmark. To the best of our knowledge, this is the first 3D rotation equivariant CNN for voxel representations.
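To make the right-angle rotation symmetry concrete, the following is a minimal NumPy sketch, not the CubeNet implementation. It assumes a cubic voxel grid and a single filter of the same size, enumerates the 24 rotations of the cube as axis permutations combined with axis flips, and compares a plain filter response with one that is max-pooled over all rotated copies of the filter. The names `cube_rotations`, `rotate`, and `pooled_response` are hypothetical and chosen only for illustration.

```python
import itertools
import numpy as np

def cube_rotations():
    """Enumerate the 24 right-angle rotations as (axis permutation, flipped axes)."""
    group = []
    for perm in itertools.permutations(range(3)):
        # Parity of the permutation: +1 if even, -1 if odd (count inversions).
        inversions = sum(perm[i] > perm[j] for i in range(3) for j in range(i + 1, 3))
        perm_sign = (-1) ** inversions
        for signs in itertools.product((1, -1), repeat=3):
            # Keep proper rotations only (determinant +1, so no mirror images).
            if perm_sign * signs[0] * signs[1] * signs[2] == 1:
                flips = tuple(ax for ax, s in enumerate(signs) if s == -1)
                group.append((perm, flips))
    return group

def rotate(volume, g):
    """Apply one group element to a cubic volume: permute axes, then flip."""
    perm, flips = g
    out = np.transpose(volume, perm)
    return np.flip(out, axis=flips) if flips else out

def pooled_response(volume, kernel, group):
    """Correlate the volume with every rotated kernel, then max-pool over the group."""
    return max(int(np.sum(volume * rotate(kernel, g))) for g in group)

group = cube_rotations()
volume = np.arange(27).reshape(3, 3, 3)  # toy voxel grid with distinct integer values
kernel = volume ** 2                     # toy filter, same size as the volume

# Responses collected over all 24 rotated copies of the input.
plain = {int(np.sum(rotate(volume, h) * kernel)) for h in group}
pooled = {pooled_response(rotate(volume, h), kernel, group) for h in group}

print(len(group), len(plain) > 1, len(pooled) == 1)  # prints: 24 True True
```

The plain response changes when the input is rotated, while the group-pooled response does not. Pooling over the group yields invariance only; an equivariant network in the sense of the abstract would additionally keep the per-rotation responses through its intermediate layers, so that the pose remains recoverable.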
