Abstract
The past few years have witnessed exponential growth of interest in deep learning methodologies, with rapidly improving accuracies and reduced computational complexity. In particular, architectures using Convolutional Neural Networks (CNNs) have produced state-of-the-art performance on image classification and object recognition tasks. Recently, Capsule Networks (CapsNet) achieved a significant increase in performance by addressing an inherent limitation of CNNs in encoding pose and deformation. Inspired by this advancement, we asked ourselves: can we do better? We propose Dense Capsule Networks (DCNet) and Diverse Capsule Networks (DCNet++). The two proposed frameworks customize CapsNet by replacing the standard convolutional layers with densely connected convolutions. This helps incorporate feature maps learned by different layers when forming the primary capsules. DCNet essentially adds a deeper convolutional network, which leads to the learning of discriminative feature maps. Additionally, DCNet++ uses a hierarchical architecture to learn capsules that represent spatial information in a fine-to-coarse manner, which makes it more efficient at learning complex data. Experiments on image classification tasks using benchmark datasets demonstrate the efficacy of the proposed architectures. DCNet achieves state-of-the-art performance (99.75%) on the MNIST dataset with a twenty-fold decrease in total training iterations compared to the conventional CapsNet. Furthermore, DCNet++ performs better than CapsNet on the SVHN dataset (96.90%), and outperforms the ensemble of seven CapsNet models on CIFAR-10 by 0.31% with a seven-fold decrease in the number of parameters.
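To make the idea of densely connected convolutions concrete, the following is a minimal sketch of dense connectivity, not the DCNet implementation. It assumes flattened toy feature maps, and a random channel-mixing step with ReLU stands in for a learned convolution. The function name `dense_block` and the values `num_layers=8` and `growth=32` are illustrative assumptions. The point it shows is that every layer reads all earlier feature maps, so the final stack, from which primary capsules would be formed, retains maps from every depth.

```python
import random


def dense_block(feature_maps, num_layers, growth, seed=0):
    """Toy densely connected block on flattened feature maps.

    feature_maps: list of maps, each a list of floats of equal length.
    Every layer reads ALL maps produced so far and appends `growth` new ones.
    """
    rng = random.Random(seed)
    maps = [list(m) for m in feature_maps]
    num_pixels = len(maps[0])
    for _ in range(num_layers):
        new_maps = []
        for _ in range(growth):
            # Hypothetical stand-in for a learned convolution: random
            # per-map mixing weights followed by a ReLU.
            weights = [rng.gauss(0.0, 0.1) for _ in maps]
            mixed = [
                max(0.0, sum(w * m[i] for w, m in zip(weights, maps)))
                for i in range(num_pixels)
            ]
            new_maps.append(mixed)
        # Concatenation: earlier maps are kept alongside the new ones.
        maps.extend(new_maps)
    return maps


pixel_rng = random.Random(1)
image = [[pixel_rng.random() for _ in range(16)]]  # one 4x4 map, flattened
out = dense_block(image, num_layers=8, growth=32)
print(len(out))  # 1 input map + 8 layers * 32 new maps = 257
```

Because the input map and every intermediate map survive in the output, a later stage can draw on both shallow and deep features, which is the property the abstract attributes to dense connections.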