
SRNET: A Shallow Skip Connection Based Convolutional Neural Network Design for Resolving Singularities

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Convolutional neural networks (CNNs) have made tremendous progress in recent years. Since their emergence, CNNs have delivered excellent performance on most classification and segmentation tasks, and the CNN family now includes a range of architectures that dominate major vision-based recognition benchmarks. However, building a neural network (NN) by simply stacking convolution blocks limits its optimization ability and introduces overfitting and vanishing-gradient problems. A key cause of these issues is network singularities, which produce degenerate manifolds in the loss landscape, slowing learning and lowering performance. Skip connections have proven to be an essential element of CNN design for mitigating such singularities. This research introduces skip connections into the NN architecture to augment information flow, mitigate singularities, and improve performance. We experiment with skip connections at different levels and propose a placement strategy for these links that applies to any CNN. To test this hypothesis, we design an experimental CNN architecture, named Shallow Wide ResNet (SRNet), which uses a wide residual network as its base design. We perform extensive experiments on two well-known datasets, CIFAR-10 and CIFAR-100, for training and testing. The empirical results show promising gains in performance, efficiency, and the reduction of network-singularity issues.
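
The full text is available to subscribers only, but the core mechanism the abstract describes is easy to illustrate. Below is a minimal sketch of a pre-activation residual (skip-connection) block in the style of Wide ResNet, the base design the abstract names for SRNet. This is an illustrative PyTorch reconstruction, not the authors' released code; the class name, channel widths, and strides are assumptions.

```python
# Minimal sketch of a pre-activation residual (skip-connection) block in the
# style of Wide ResNet, the base design named for SRNet. Illustrative only:
# layer names, widths, and strides are assumptions, not the paper's code.
import torch
import torch.nn as nn


class WideResidualBlock(nn.Module):
    """BN -> ReLU -> Conv, twice, with an identity (skip) shortcut added back."""

    def __init__(self, in_channels: int, out_channels: int, stride: int = 1):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(in_channels)
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3,
                               stride=1, padding=1, bias=False)
        self.relu = nn.ReLU(inplace=True)
        # 1x1 projection when the shortcut must change shape; identity otherwise.
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Conv2d(in_channels, out_channels, kernel_size=1,
                                      stride=stride, bias=False)
        else:
            self.shortcut = nn.Identity()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.conv1(self.relu(self.bn1(x)))
        out = self.conv2(self.relu(self.bn2(out)))
        # The skip connection: adding x back breaks the symmetries that let
        # hidden units become permutable or collapse to zero, the degeneracies
        # (singularities) in the loss landscape that the paper targets.
        return out + self.shortcut(x)


if __name__ == "__main__":
    block = WideResidualBlock(16, 32, stride=2)
    x = torch.randn(1, 16, 32, 32)   # CIFAR-sized input feature map
    print(block(x).shape)            # torch.Size([1, 32, 16, 16])
```

The identity shortcut is the essential ingredient: it keeps gradients flowing directly to earlier layers and lifts the degenerate (singular) regions of the loss landscape that otherwise slow learning, which is the effect the paper studies when varying where such links are placed.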



Acknowledgements

I would like to acknowledge Tony Pridmore, Michael Pound, Khan Faraz, Mohammadreza Soltaninejad and John Atanbori of the Computer Vision Laboratory, School of Computer Science, University of Nottingham, for their insightful discussions.

Author information


Corresponding author

Correspondence to Robail Yasrab.

Electronic supplementary material

ESM 1

(PDF 654 kb)


About this article


Cite this article

Yasrab, R. SRNET: A Shallow Skip Connection Based Convolutional Neural Network Design for Resolving Singularities. J. Comput. Sci. Technol. 34, 924–938 (2019). https://doi.org/10.1007/s11390-019-1950-8

