An Experimental Perspective for Computation-Efficient Neural Networks Training

Yin, Lujia; Chen, Xiaotao; Qin, Zheng; Zhang, Zhaoning; Feng, Jinghua; Li, Dongsheng

doi:10.1007/978-981-13-2423-9_13

Part of the book series: Communications in Computer and Information Science (CCIS, volume 908)

Included in the following conference series:

Conference on Advanced Computer Architecture

Abstract

The demand for computation-efficient neural networks that can deploy deep learning models on inexpensive, widely used devices has led to many lightweight networks, such as the MobileNet series and ShuffleNet. These computation-efficient models are designed for very limited computational budgets, e.g., 10–150 MFLOPs, and run efficiently on ARM-based devices. They have a smaller CMR than large networks such as VGG, ResNet, and Inception.

Such models are quite efficient for inference on ARM, but how do they fare for inference or training on GPU? Unfortunately, compact models usually cannot fully utilize the GPU, although they run fast because of their small size. In this paper, we present a series of extensive experiments on training compact models, covering a single host with GPU and CPU as well as a distributed environment. We then give some analysis and suggestions for training.

Notes

1. The chipset is a Qualcomm MSM8996 Snapdragon 821 with a quad-core CPU (4 × 2.15/2.16 GHz Kryo).
2. Unlike in the original papers, the computational complexity and the memory accesses also include the pooling, lateral and activation layers.

Author information

Authors and Affiliations

Science and Technology on Parallel and Distributed Laboratory, National University of Defense Technology, Changsha, China
Lujia Yin, Xiaotao Chen, Zheng Qin, Zhaoning Zhang, Jinghua Feng & Dongsheng Li

Corresponding author

Correspondence to Zhaoning Zhang.

Editor information

Editors and Affiliations

Shanghai Jiao Tong University, Shanghai, China
Chao Li
National University of Defense Technology, Changsha, China
Junjie Wu

About this paper

Cite this paper

Yin, L., Chen, X., Qin, Z., Zhang, Z., Feng, J., Li, D. (2018). An Experimental Perspective for Computation-Efficient Neural Networks Training. In: Li, C., Wu, J. (eds) Advanced Computer Architecture. ACA 2018. Communications in Computer and Information Science, vol 908. Springer, Singapore. https://doi.org/10.1007/978-981-13-2423-9_13

DOI: https://doi.org/10.1007/978-981-13-2423-9_13
Published: 13 September 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-2422-2
Online ISBN: 978-981-13-2423-9
eBook Packages: Computer Science, Computer Science (R0)
