Inference Performance Comparison of Convolutional Neural Networks on Edge Devices

Reza, Sheikh Rufsan; Yan, Yuzhong; Dong, Xishuang; Qian, Lijun

doi:10.1007/978-3-030-76063-2_23

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 372)

Included in the following conference series:

International Summit Smart City 360°

Abstract

With the proliferation of the Internet of Things (IoT), large amounts of data are generated at edge devices at an unprecedented speed. To protect the privacy and security of this edge data, and to reduce communication costs, it is desirable to process the data locally on the edge devices. In this study, the inference performance of several popular pre-trained convolutional neural networks is evaluated on three edge computing devices. Specifically, the MobileNetV1, MobileNetV2, and InceptionV3 models are tested on the NVIDIA Jetson TX2, the Jetson Nano, and the Google Edge TPU for image classification. Furthermore, various compression techniques, including pruning, quantization, binarized neural networks, and tensor decomposition, are applied to reduce model complexity. The results provide guidance for practitioners deploying deep learning models on resource-constrained edge devices for near real-time, on-site learning.
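As a rough illustration of two of the compression techniques named above, the following sketch applies magnitude pruning and affine 8-bit quantization to a flat list of weights. It is not taken from the chapter and does not reproduce the authors' pipeline; the function names (`magnitude_prune`, `quantize_uint8`, `dequantize`), the sample weights, and the 50% sparsity level are assumptions chosen only to show the idea.

```python
from typing import List, Tuple


def magnitude_prune(weights: List[float], sparsity: float) -> List[float]:
    """Zero out the smallest-magnitude weights (hypothetical helper).

    `sparsity` is the fraction of weights to remove, in [0, 1].
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by ascending absolute value; the first n_prune are dropped.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_uint8(weights: List[float]) -> Tuple[List[int], float, int]:
    """Affine 8-bit quantization so that w ~= scale * (q - zero_point)."""
    # Force the range to contain 0 so that pruned weights map back to exactly 0.
    lo = min(min(weights), 0.0)
    hi = max(max(weights), 0.0)
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    zero_point = int(round(-lo / scale))
    q = [min(255, max(0, int(round(w / scale)) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize(q: List[int], scale: float, zero_point: int) -> List[float]:
    """Map 8-bit codes back to approximate floating-point weights."""
    return [scale * (v - zero_point) for v in q]


if __name__ == "__main__":
    w = [0.8, -0.05, 0.3, -1.2, 0.01, 0.6]  # made-up example weights
    pruned = magnitude_prune(w, sparsity=0.5)
    q, scale, zp = quantize_uint8(pruned)
    restored = dequantize(q, scale, zp)
    print("pruned:  ", pruned)
    print("uint8:   ", q)
    print("restored:", [round(v, 3) for v in restored])
```

In this toy setting, pruning removes the half of the weights with the smallest magnitude, and quantization stores each remaining weight in one byte instead of a 32-bit float, with a reconstruction error of at most about one quantization step. Real deployments on the devices named in the abstract would typically rely on vendor toolchains rather than hand-written routines like these.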

Acknowledgment

This research work is supported by the U.S. Office of the Under Secretary of Defense for Research and Engineering (OUSD(R&E)) under agreement number FA8750-15-2-0119. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Office of the Under Secretary of Defense for Research and Engineering (OUSD(R&E)) or the U.S. Government.

Author information

Authors and Affiliations

Center of Excellence in Research and Education for Big Military Data Intelligence (CREDIT Center), Prairie View A&M University, Texas A&M University System, Prairie View, TX, 77446, USA
Sheikh Rufsan Reza, Yuzhong Yan, Xishuang Dong & Lijun Qian

Corresponding author

Correspondence to Sheikh Rufsan Reza.

Editor information

Editors and Affiliations

School of Technology and Management, Escola Superior de Tecnologia e Gestão, Viana do Castelo, Portugal
Sara Paiva
Polytechnic Institute of Viana do Castelo, Viana do Castelo, Portugal
Sérgio Ivan Lopes
SIC Laboratory, INSEEC U, ECE Paris Graduate School of Engineering, Paris, France
Rafik Zitouni
Vaagdevi College of Engineering, Telangana, India
Nishu Gupta
University of Minho, Guimarães, Portugal
Sérgio F. Lopes
Nagoya University, Nagoya, Japan
Takuro Yonezawa

About this paper

Cite this paper

Reza, S.R., Yan, Y., Dong, X., Qian, L. (2021). Inference Performance Comparison of Convolutional Neural Networks on Edge Devices. In: Paiva, S., Lopes, S.I., Zitouni, R., Gupta, N., Lopes, S.F., Yonezawa, T. (eds) Science and Technologies for Smart Cities. SmartCity360° 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 372. Springer, Cham. https://doi.org/10.1007/978-3-030-76063-2_23

DOI: https://doi.org/10.1007/978-3-030-76063-2_23
Published: 22 May 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-76062-5
Online ISBN: 978-3-030-76063-2
eBook Packages: Computer Science, Computer Science (R0)
