
Weight-sharing multi-stage multi-scale ensemble convolutional neural network

Original Article

International Journal of Machine Learning and Cybernetics

Abstract

Most existing convolutional neural networks (CNNs) ignore, to varying extents, the multi-scale features of the input image. They therefore lack robustness to the feature scale of the input, which limits the generalization ability of the model. In addition, given large-scale data, CNNs generally require more layers and a huge number of parameters to reach higher image classification accuracy, resulting in a higher network training cost. To this end, a Weight-Sharing Multi-Stage Multi-Scale Ensemble Convolutional Neural Network (WSMSMSE-CNN) is proposed in this paper. The input image is pooled several times to obtain multi-scale images, which are fed into a multi-stage network. Each stage is a multi-layer multi-scale ensemble network consisting of Conv Blocks, pooling layers and Dropout layers. Conv Blocks in the same stage are connected by pooling layers, while those in different stages but at the same location share the same weights. In this way, both multi-scale features of the same image and scale features of multi-scale images are obtained. In addition, each large-sized convolutional kernel is replaced by several consecutive small-sized ones, which keeps the receptive field unchanged while effectively controlling the number of parameters. Experimental results on the CIFAR-10 and CIFAR-100 datasets verify that WSMSMSE-CNN not only has good robustness but also requires fewer layers to obtain higher classification accuracy.
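To make the architecture concrete, below is a minimal PyTorch sketch of the ideas named in the abstract: a shared stack of Conv Blocks reused across stages, an input pyramid built by repeated pooling, stacked 3x3 convolutions standing in for one larger kernel, and a simple average of per-stage logits as the ensemble. All module names, depths, channel widths, the dropout rate and the logit-averaging fusion are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvBlock(nn.Module):
    """Two stacked 3x3 convolutions in place of one 5x5 kernel:
    the receptive field is unchanged but the parameter count drops."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)


class WSMSMSESketch(nn.Module):
    """Stage s sees the input pooled s times and reuses the SAME ConvBlocks
    (shared weights across stages); per-stage logits are averaged."""

    def __init__(self, num_classes: int = 10, num_stages: int = 3, width: int = 64):
        super().__init__()
        self.num_stages = num_stages
        # One shared stack of blocks; the block at each position is reused
        # by every stage, mirroring the cross-stage weight sharing.
        self.blocks = nn.ModuleList(
            [ConvBlock(3, width)]
            + [ConvBlock(width, width) for _ in range(num_stages - 1)]
        )
        self.dropout = nn.Dropout(p=0.3)  # illustrative rate, not from the paper
        self.classifier = nn.Linear(width, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        stage_logits = []
        for s in range(self.num_stages):
            h = x
            for _ in range(s):  # build the multi-scale input pyramid
                h = F.avg_pool2d(h, kernel_size=2)
            # Smaller inputs pass through fewer shared blocks; blocks inside
            # a stage are connected by pooling layers, as in the abstract.
            for b in range(self.num_stages - s):
                h = self.blocks[b](h)
                h = F.max_pool2d(h, kernel_size=2)
                h = self.dropout(h)
            h = F.adaptive_avg_pool2d(h, 1).flatten(1)
            stage_logits.append(self.classifier(h))
        # Simple ensemble: average the logits of all stages.
        return torch.stack(stage_logits).mean(dim=0)


# Example with a CIFAR-sized input
model = WSMSMSESketch(num_classes=10)
out = model(torch.randn(4, 3, 32, 32))
print(out.shape)  # torch.Size([4, 10])

Averaging logits is only one plausible fusion; the paper may combine stage outputs differently, and how "the same location" is aligned across stages of different depths is likewise an assumption of this sketch.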



Acknowledgements

This work is supported by the National Natural Science Foundation of China (61772532, 61472424).

Author information

Corresponding author

Correspondence to Yuhu Cheng.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Cite this article

Wang, X., Bao, A., Cheng, Y. et al. Weight-sharing multi-stage multi-scale ensemble convolutional neural network. Int. J. Mach. Learn. & Cyber. 10, 1631–1642 (2019). https://doi.org/10.1007/s13042-018-0842-5

