Abstract
Most existing convolutional neural networks (CNNs) ignore, to varying extents, the multi-scale features of the input image. They therefore lack robustness to the feature scale of the input, which limits the generalization ability of the model. In addition, given large-scale data, CNNs generally require more layers and a huge number of parameters to reach higher image classification accuracy, which raises the cost of network training. To this end, a Weight-Sharing Multi-Stage Multi-Scale Ensemble Convolutional Neural Network (WSMSMSE-CNN) is proposed in this paper. The input image is pooled several times to obtain multi-scale images, which are fed into a multi-stage network. Each stage is a multi-layer multi-scale ensemble network consisting of Conv Blocks, pooling layers and dropout layers. Conv Blocks within the same stage are connected by pooling layers, while Conv Blocks at the same position in different stages share the same weights. In this way, the network captures both multi-scale features of a single image and the features of images at multiple scales. In addition, each large convolutional kernel is replaced by several consecutive small kernels, which keeps the receptive field unchanged while effectively controlling the number of parameters. Experimental results on the CIFAR-10 and CIFAR-100 datasets verify that WSMSMSE-CNN not only has good robustness but also requires fewer layers to obtain higher classification accuracy.
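The kernel-decomposition claim in the abstract can be checked with simple arithmetic: with stride-1 convolutions, each k×k layer widens the receptive field by k − 1, so two stacked 3×3 kernels cover the same 5×5 field as one 5×5 kernel while using fewer weights. The sketch below (helper names are illustrative, not from the paper; bias terms are omitted and channel width is held constant) makes both counts explicit:

```python
def receptive_field(kernel_sizes):
    """Receptive field of stacked stride-1 convolutions:
    each k x k layer adds (k - 1) to the field."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf

def conv_params(kernel_sizes, channels):
    """Weight count of stacked convolutions at a fixed channel
    width (no bias): k * k * C_in * C_out per layer."""
    return sum(k * k * channels * channels for k in kernel_sizes)

# One 5x5 kernel vs. two consecutive 3x3 kernels at 64 channels
print(receptive_field([5]), receptive_field([3, 3]))   # 5 5
print(conv_params([5], 64), conv_params([3, 3], 64))   # 102400 73728
```

The two configurations have identical receptive fields, but the stacked 3×3 variant needs only 72% of the weights (and additionally interleaves an extra nonlinearity), which is the trade-off the abstract refers to.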
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grants 61772532 and 61472424).
Wang, X., Bao, A., Cheng, Y. et al. Weight-sharing multi-stage multi-scale ensemble convolutional neural network. Int. J. Mach. Learn. & Cyber. 10, 1631–1642 (2019). https://doi.org/10.1007/s13042-018-0842-5