Abstract
Deep neural networks (DNNs) have achieved great success in machine learning owing to their powerful ability to learn and represent knowledge. However, such models often contain a massive number of trainable parameters, which imposes a heavy resource burden in practice. Reducing the number of parameters while preserving competitive performance is therefore a critical task in the field of DNNs. In this paper, we focus on one type of convolutional neural network that contains many repeated, identically structured convolutional layers. Residual networks and their variants are widely used because they make deeper models easier to train. A typical block of such a model contains two convolutional layers and thus two layers of trainable parameters. We instead use only one layer of trainable parameters per block, so that the two convolutional layers within a block share a single set of weights. We performed extensive experiments on different residual-network architectures with trainable-parameter sharing on the CIFAR-10, CIFAR-100, and ImageNet datasets. We found that the models with parameter sharing achieve lower error on the training sets and recognition accuracy very close to that of the original models (within 0.5%), while the number of parameters is reduced by more than one third of the original total.
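The sharing scheme described above can be sketched as follows, assuming a minimal NumPy implementation (the function names and the naive convolution are illustrative, not taken from the paper): both convolutions inside one residual block reuse the same weight tensor, so the block stores half as many convolutional parameters.

```python
import numpy as np

def conv3x3(x, w):
    """Naive 3x3 'same' convolution: x is (C_in, H, W), w is (C_out, C_in, 3, 3)."""
    c_out, c_in, _, _ = w.shape
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))  # zero-pad spatial dims
    out = np.zeros((c_out, h, wd))
    for o in range(c_out):
        for i in range(h):
            for j in range(wd):
                out[o, i, j] = np.sum(xp[:, i:i + 3, j:j + 3] * w[o])
    return out

def shared_residual_block(x, w):
    """Residual block where BOTH convolutions reuse the same weight tensor w."""
    relu = lambda t: np.maximum(t, 0.0)
    return relu(conv3x3(relu(conv3x3(x, w)), w) + x)  # identity shortcut

c = 8
w = np.random.randn(c, c, 3, 3) * 0.1  # one trainable layer instead of two
x = np.random.randn(c, 16, 16)
y = shared_residual_block(x, w)

params_original = 2 * w.size  # two independent conv layers per block
params_shared = 1 * w.size    # one shared weight tensor per block
print(y.shape, params_shared / params_original)  # → (8, 16, 16) 0.5
```

In a full network the savings are somewhat below one half, since shortcut projections, batch-normalization layers, and the final classifier are not shared, which is consistent with the "more than one third" reduction reported above.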
Acknowledgements
This work was sponsored by Natural Science Foundation of Chongqing (No. E021D2019034), Chongqing Education Commission (No. E010J2019025), NSFC project (No. 61771146, 61375122), and in part by Shanghai Science and Technology Development Funds (No. 13dz2260200, 13511504300).
Cite this article
Dai, D., Yu, L. & Wei, H. Parameters Sharing in Residual Neural Networks. Neural Process Lett 51, 1393–1410 (2020). https://doi.org/10.1007/s11063-019-10143-4