Abstract
3D Convolutional Neural Networks are sensitive to transformations applied to their input. This is a problem because a voxelized version of a 3D object and its rotated clone will look unrelated to each other after passing through the last layer of a network. Ideally, a model would preserve a meaningful representation of the voxelized object while explaining the pose difference between the two inputs. An equivariant representation vector has two components: the invariant identity part and a discernible encoding of the transformation. Models that cannot explain pose differences risk "diluting" the representation in pursuit of optimizing a classification or regression loss function. We introduce a Group Convolutional Neural Network with linear equivariance to translations and right-angle rotations in three dimensions. We call this network CubeNet, reflecting its cube-like symmetry. By construction, this network helps preserve a 3D shape's global and local signature as it is transformed through successive layers. We apply this network to a variety of 3D inference problems, achieving state-of-the-art performance on the ModelNet10 classification challenge and comparable performance on the ISBI 2012 Connectome Segmentation Benchmark. To the best of our knowledge, this is the first 3D rotation-equivariant CNN for voxel representations.
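
To make the distinction between equivariance and invariance concrete, the following is a minimal NumPy sketch and not the CubeNet implementation. It enumerates the 24 right-angle rotations of a voxel grid and checks how three toy feature maps respond to them. The helper names (`cube_rotations`, `rotate`, `box_sum`, `axis_difference`) and the toy filters are assumptions introduced purely for illustration; periodic boundaries are assumed so that the check is exact.

```python
import itertools
import numpy as np


def cube_rotations():
    """Enumerate the 24 right-angle rotations as (axis permutation, flipped axes)."""
    rotations = []
    for perm in itertools.permutations(range(3)):
        for flips in itertools.product([False, True], repeat=3):
            # Signed permutation matrix for this axis shuffle.
            m = np.zeros((3, 3), dtype=int)
            for row, col in enumerate(perm):
                m[row, col] = -1 if flips[row] else 1
            # Determinant +1 keeps proper rotations and drops reflections.
            if int(round(np.linalg.det(m))) == 1:
                flipped = tuple(ax for ax, f in enumerate(flips) if f)
                rotations.append((perm, flipped))
    return rotations


def rotate(vol, rotation):
    """Apply one right-angle rotation to a cubic voxel grid."""
    perm, flipped = rotation
    out = np.transpose(vol, perm)
    for ax in flipped:
        out = np.flip(out, axis=ax)
    return out


def box_sum(vol):
    """Toy isotropic filter: 3x3x3 neighbourhood sum with periodic boundaries."""
    total = np.zeros_like(vol)
    for shift in itertools.product([-1, 0, 1], repeat=3):
        total += np.roll(vol, shift, axis=(0, 1, 2))
    return total


def axis_difference(vol):
    """Toy anisotropic filter: finite difference along axis 0 only."""
    return np.roll(vol, 1, axis=0) - vol


rng = np.random.default_rng(0)
voxels = rng.integers(0, 10, size=(5, 5, 5))
rotations = cube_rotations()

# Count how many distinct grids the rotations produce from a labelled cube.
labels = np.arange(27).reshape(3, 3, 3)
n_distinct = len({rotate(labels, r).tobytes() for r in rotations})

# Equivariance: rotating the input rotates the output in the same way.
box_equivariant = all(
    np.array_equal(box_sum(rotate(voxels, r)), rotate(box_sum(voxels), r))
    for r in rotations
)
diff_equivariant = all(
    np.array_equal(
        axis_difference(rotate(voxels, r)), rotate(axis_difference(voxels), r)
    )
    for r in rotations
)

# Invariance: a global pool over an equivariant map ignores the rotation.
pool_invariant = all(
    box_sum(rotate(voxels, r)).max() == box_sum(voxels).max() for r in rotations
)

print(len(rotations), n_distinct, box_equivariant, diff_equivariant, pool_invariant)
```

In this sketch the isotropic filter commutes with every rotation, while the axis-aligned filter does not, which mirrors the sensitivity of ordinary learned 3D filters described above. A group convolution would, roughly speaking, restore equivariance for arbitrary filters by also evaluating rotated copies of each filter; that step is not shown here.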