Abstract
This paper focuses on improving the efficiency of sparse convolutional neural network (CNN) layers on graphics processing units (GPUs). GPUs are among the most efficient and most commonly used accelerators for deep learning computations, and the NVIDIA CUDA Deep Neural Network (cuDNN) library provides some of the most effective implementations of deep learning (DL) algorithms for them. Modern CNN models require megabytes of coefficients and millions of multiply-accumulate (MAC) operations to perform a convolution. One of the most common techniques for compressing CNN models is weight pruning. There are two main types of pruning: structural (removing whole weight channels) and non-structural (removing individual weights). The first makes acceleration much easier, but it is difficult to reach sparsity levels and accuracy as high as those obtained with the second. Non-structural pruning with retraining can produce weight matrices with \({\sim }90\%\) or more sparsity in some deep CNN models. This work shows when it is worth using a direct sparse operation to speed up the computation of convolution layers. The VGG-16, CNN-non-static, and \(1 \times 1\) layers from ResNet models were used as benchmarks. In addition, we present the impact of using reduced precision on time efficiency.
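To make the idea of a direct sparse operation concrete, the following is a minimal CPU reference sketch, not the authors' GPU implementation: the convolution is lowered with im2col and the pruned weights are stored in CSR form, so the sparse-dense GEMM only spends MAC operations on surviving nonzero weights. Stride 1, no padding, a single input image, and the function name direct_sparse_conv2d are our own illustrative assumptions.

```python
import numpy as np
from scipy.sparse import csr_matrix

def direct_sparse_conv2d(x, w_sparse, kh, kw):
    """Convolve x (c_in, h, w) with CSR weights (c_out, c_in*kh*kw):
    lower the input with im2col, then run a sparse-dense GEMM so that
    only the nonzero (unpruned) weights contribute MAC operations."""
    c_in, h, w = x.shape
    oh, ow = h - kh + 1, w - kw + 1
    cols = np.empty((c_in * kh * kw, oh * ow), dtype=x.dtype)
    row = 0
    for c in range(c_in):
        for i in range(kh):
            for j in range(kw):
                cols[row] = x[c, i:i + oh, j:j + ow].reshape(-1)
                row += 1
    out = np.asarray(w_sparse @ cols)   # sparse GEMM skips zero weights
    return out.reshape(-1, oh, ow)      # (c_out, oh, ow)

# Magnitude pruning to high sparsity, then convert weights to CSR.
rng = np.random.default_rng(0)
w = rng.standard_normal((16, 3 * 3 * 3)).astype(np.float32)
w[np.abs(w) < 1.645] = 0.0              # ~90% sparsity, as in heavily pruned models
x = rng.standard_normal((3, 32, 32)).astype(np.float32)
y = direct_sparse_conv2d(x, csr_matrix(w), 3, 3)
print(y.shape)                          # (16, 30, 30)
```

Whether such a sparse GEMM beats a dense cuDNN convolution depends on the sparsity level and layer shape, which is precisely the trade-off the paper benchmarks.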
Acknowledgment
This work was supported by funds provided by AGH University of Science and Technology in 2020.