Abstract
The introduction of skip connections that sum feature maps in deep residual networks (ResNets) was crucial for overcoming gradient degradation in very deep convolutional neural networks (CNNs). Given the strong results of ResNets, it is natural to use the features they produce at various layers for transfer learning or other feature-extraction tasks. To analyse how ResNets solve the gradient degradation problem, we empirically investigate how discriminability changes as inputs propagate through the intermediate layers of two CNN variants: all-convolutional CNNs and ResNets. We found that feature maps produced by residual-sum layers exhibit increasing discriminability with layer-distance from the input, but that feature maps produced by convolutional layers do not. We also studied how discriminability varies with training duration and with the placement of convolutional layers. Our method suggests a way to determine whether adding extra layers will improve performance, and shows how gradient degradation affects which layers contribute increased discriminability.
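A common way to quantify the layer-wise discriminability studied here is to attach a linear classifier probe to the frozen feature map at each depth and compare probe accuracies across depths. The sketch below illustrates that idea only; it is not the authors' code, and the choices of torchvision's ImageNet-pretrained ResNet-18, global average pooling, and random stand-in data are assumptions made purely for illustration.

```python
# Minimal sketch (not the paper's code) of layer-wise discriminability
# probing: freeze a trained ResNet, capture the feature map after each
# residual stage, and fit a linear classifier on it. Rising probe accuracy
# with depth indicates increasing discriminability.
import torch
import torch.nn as nn
import torchvision

model = torchvision.models.resnet18(weights="IMAGENET1K_V1").eval()

# Capture the output of each residual stage with forward hooks.
feats = {}
for stage in ["layer1", "layer2", "layer3", "layer4"]:
    getattr(model, stage).register_forward_hook(
        lambda m, i, o, s=stage: feats.__setitem__(s, o.detach()))

def probe_accuracy(stage, images, labels, num_classes=10, steps=200):
    """Train a linear probe on globally pooled features from one stage."""
    with torch.no_grad():
        model(images)                      # hooks populate `feats`
    x = feats[stage].mean(dim=(2, 3))      # global average pool -> (N, C)
    probe = nn.Linear(x.shape[1], num_classes)
    opt = torch.optim.Adam(probe.parameters(), lr=1e-2)
    for _ in range(steps):
        loss = nn.functional.cross_entropy(probe(x), labels)
        opt.zero_grad(); loss.backward(); opt.step()
    return (probe(x).argmax(dim=1) == labels).float().mean().item()

# Random stand-in inputs; substitute a real labelled dataset (e.g. CIFAR-10)
# and a held-out split to observe the depth-wise trend discussed above.
images = torch.randn(64, 3, 224, 224)
labels = torch.randint(0, 10, (64,))
for stage in ["layer1", "layer2", "layer3", "layer4"]:
    print(stage, probe_accuracy(stage, images, labels))
```

With real data and a held-out evaluation split, applying the same probe to the outputs of plain convolutional layers versus residual-sum layers would contrast the two behaviours described in the abstract.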
Acknowledgements
This work was funded by an Australian Government Research Training Program (RTP) Scholarship.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Gao, W., McDonnell, M.D. (2017). Analysis of Gradient Degradation and Feature Map Quality in Deep All-Convolutional Neural Networks Compared to Deep Residual Networks. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, E.S. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol. 10635. Springer, Cham. https://doi.org/10.1007/978-3-319-70096-0_63
DOI: https://doi.org/10.1007/978-3-319-70096-0_63
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70095-3
Online ISBN: 978-3-319-70096-0
eBook Packages: Computer Science, Computer Science (R0)