Abstract
The past few years have witnessed exponential growth of interest in deep learning methodologies, with rapidly improving accuracy and reduced computational complexity. In particular, architectures using Convolutional Neural Networks (CNNs) have produced state-of-the-art performance on image classification and object recognition tasks. Recently, Capsule Networks (CapsNets) achieved a significant increase in performance by addressing an inherent limitation of CNNs in encoding pose and deformation. Inspired by this advancement, we propose Multi-level Dense Capsule Networks (multi-level DCNets). The proposed framework customizes CapsNet by adding multi-level capsules and replacing the standard convolutional layers with densely connected convolutions. A single-level DCNet adds a deeper convolutional network whose layers learn discriminative feature maps that are combined to form the primary capsules. Additionally, a multi-level capsule network uses a hierarchical architecture to learn new capsules from former capsules that represent spatial information in a fine-to-coarse manner, making it more efficient at learning complex data. Experiments on image classification using benchmark datasets demonstrate the efficacy of the proposed architectures. DCNet achieves state-of-the-art performance (99.75%) on the MNIST dataset with an approximately twenty-fold decrease in total training iterations over the conventional CapsNet. Furthermore, multi-level DCNet performs better than CapsNet on the SVHN dataset (96.90%) and outperforms an ensemble of seven CapsNet models on CIFAR-10 by +0.31% with a seven-fold decrease in the number of parameters. Source code, models, and figures are available at https://github.com/ssrp/Multi-level-DCNet.
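The two ingredients named in the abstract can be sketched in a few lines: DenseNet-style concatenation of feature maps from preceding layers, and regrouping the concatenated channels into primary capsule vectors that are passed through the standard CapsNet squashing nonlinearity. This is a minimal NumPy illustration under assumed toy shapes, not the authors' implementation (their code uses trained convolutional layers and dynamic routing on top of this step):

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Standard CapsNet squashing nonlinearity: shrinks short vectors
    # toward zero and long vectors toward (but below) unit length.
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dense_concat_primary_caps(feature_maps, caps_dim=8):
    # DenseNet-style connectivity: concatenate the feature maps produced
    # by different layers along the channel axis, then regroup the
    # channels into caps_dim-dimensional primary capsules and squash.
    concat = np.concatenate(feature_maps, axis=-1)  # (H, W, total_channels)
    h, w, c = concat.shape
    assert c % caps_dim == 0, "channel count must split into capsules"
    caps = concat.reshape(h * w * (c // caps_dim), caps_dim)
    return squash(caps)

# Toy example: two feature maps (e.g. from different dense-block layers)
# on a 4x4 spatial grid, with 8 and 16 channels respectively.
maps = [np.random.rand(4, 4, 8), np.random.rand(4, 4, 16)]
caps = dense_concat_primary_caps(maps)  # 48 capsules of dimension 8
```

The squash step makes each capsule's length interpretable as an existence probability, which is what allows capsules from different depths to be compared on a common scale.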
S. S. R. Phaye and A. Sikka—Equal first authors.
© 2019 Springer Nature Switzerland AG
Cite this paper
Phaye, S.S.R., Sikka, A., Dhall, A., Bathula, D.R. (2019). Multi-level Dense Capsule Networks. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science(), vol 11365. Springer, Cham. https://doi.org/10.1007/978-3-030-20873-8_37
Print ISBN: 978-3-030-20872-1
Online ISBN: 978-3-030-20873-8