Skip to main content

Inference Performance Comparison of Convolutional Neural Networks on Edge Devices

  • Conference paper
  • First Online:
Science and Technologies for Smart Cities (SmartCity360° 2020)

Abstract

With the proliferation of Internet of Things (IoT), large amount of data are generated at edge devices with an unprecedented speed. In order to protect the privacy and security of big edge data, as well as reduce the communications cost, it is desirable to process the data locally at the edge devices. In this study, the inference performance of several popular pre-trained convolutional neural networks on three edge computing devices are evaluated. Specifically, MobileNetV1 & V2 and InceptionV3 models have been tested on NVIDIA Jetson TX2, Jetson Nano, and Google Edge TPU for image classification. Furthermore, various compression techniques including pruning, quantization, binarized neural network, and tensor decomposition are applied to reduce the model complexity. The results will provide a guidance for practitioners when deploying deep learning models on resource constrained edge devices for near real-time and on-site learning.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Atzori, L., Iera, A., Morabito, G.: The internet of things: a survey. Comput. Netw. 54(15), 2787–2805 (2010)

    Article  Google Scholar 

  2. Mao, Y., You, C., Zhang, J., Huang, K., Letaief, K.B.: A survey on mobile edge computing: the communication perspective. IEEE Commun. Surv. Tutorials 19(4), 2322–2358 (2017)

    Article  Google Scholar 

  3. Neshenko, N., Bou-Harb, E., Crichigno, J., Kaddoum, G., Ghani, N.: Demystifying IoT security: an exhaustive survey on IoT vulnerabilities and a first empirical look on internet-scale IoT exploitations. IEEE Commun. Surv. Tutorials 21(3), 2702–2733 (2019)

    Article  Google Scholar 

  4. Jetson nano: Deep learning inference benchmarks. https://developer.nvidia.com/embedded/jetson-nano-dl-inference-benchmarks

  5. Howard, A.G., et al.: Mobilenets: efficient convolutional neural networks for mobile vision applications (2017). arXiv preprint arXiv:1704.04861

  6. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv 2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4510–4520 (2018)

    Google Scholar 

  7. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826 (2016)

    Google Scholar 

  8. Molchanov, P., Tyree, S., Karras, T., Aila, T., Kautz, J.: Pruning convolutional neural networks for resource efficient inference (2016). arXiv preprint arXiv:1611.06440

  9. Wu, J., Leng, C., Wang, Y., Hu, Q., Cheng, J.: Quantized convolutional neural networks for mobile devices. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4820–4828 (2016)

    Google Scholar 

  10. Zhou, A., Yao, A., Guo, Y., Xu, L., Chen, Y.: Incremental network quantization: towards lossless CNNS with low-precision weights (2017). arXiv preprint arXiv:1702.03044

  11. Zhao, R., et al.: Accelerating binarized convolutional neural networks with software-programmable FPGAs. In: Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, pp. 15–24 (2017)

    Google Scholar 

  12. Courbariaux, M., Hubara, I., Soudry, D., El-Yaniv, R., Bengio, Y.: Binarized neural networks: Training deep neural networks with weights and activations constrained to+ 1 or-1 (2016). arXiv preprint arXiv:1602.02830

  13. Cheng, T., et al.: Convolutional neural networks with low-rank regularization (2015). arXiv preprint arXiv:1511.06067

  14. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1097–1105 (2012)

    Google Scholar 

  15. Howard, A.G.: Some improvements on deep convolutional neural network based image classification (2013). arXiv preprint arXiv:1312.5402

  16. Cai, Z., Fan, Q., Feris, R.S., Vasconcelos, N.: A unified multi-scale deep convolutional neural network for fast object detection. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 354–370. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46493-0_22

    Chapter  Google Scholar 

  17. Hong, S., You, T., Kwak, S., Han, B.: Online tracking by learning discriminative saliency map with convolutional neural network. In: International Conference on Machine Learning, pp. 597–606 (2015)

    Google Scholar 

  18. Yanai, K., Ryosuke Tanno, and Koichi Okamoto. Efficient mobile implementation of a cnn-based object recognition system. In: Proceedings of the 24th ACM International Conference on Multimedia, pp. 362–366 (2016)

    Google Scholar 

  19. Li, X., Zhou, Y., Pan, Z., Feng, J.: Partial order pruning: for best speed/accuracy trade-off in neural architecture search. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019

    Google Scholar 

  20. Makhzani, A., Frey, B.J.: Winner-take-all autoencoders. In: Advances in Neural Information Processing Systems, pp. 2791–2799 (2015)

    Google Scholar 

  21. Deep learning SDK documentation. https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html

  22. Vanholder, H.: Efficient inference with tensorRT (2016)

    Google Scholar 

  23. Real-time natural language understanding with BERT using tensorRT. https://devblogs.nvidia.com/nlu-with-tensorrt-bert/

  24. Géron, A.: Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. O’Reilly Media, Sebastopol (2019)

    Google Scholar 

  25. Tensorflow models on the edge TPU. https://coral.ai/docs/edgetpu/models-intro/#compatibility-overview

  26. Internet of things. https://cloud.google.com/edge-tpu

  27. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Li, F.-F.: Imagenet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)

    Google Scholar 

  28. Han, S., Mao, H., Dally, W.J.: Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding (2015). arXiv preprint arXiv:1510.00149

  29. Edge TPU performance benchmarks. https://coral.ai/docs/edgetpu/benchmarks/

  30. Taylor, B., Marco, V.S., Wolff, W., Elkhatib, Y., Wang, Z.: Adaptive deep learning model selection on embedded systems. ACM SIGPLAN Notices 53(6), 31–43 (2018)

    Article  Google Scholar 

Download references

Acknowledgment

This research work is supported by the U.S. Office of the Under Secretary of Defense for Research and Engineering (OUSD(R&E)) under agreement number FA8750-15-2-0119. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Office of the Under Secretary of Defense for Research and Engineering (OUSD(R&E)) or the U.S. Government.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Sheikh Rufsan Reza .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Reza, S.R., Yan, Y., Dong, X., Qian, L. (2021). Inference Performance Comparison of Convolutional Neural Networks on Edge Devices. In: Paiva, S., Lopes, S.I., Zitouni, R., Gupta, N., Lopes, S.F., Yonezawa, T. (eds) Science and Technologies for Smart Cities. SmartCity360° 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 372. Springer, Cham. https://doi.org/10.1007/978-3-030-76063-2_23

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-76063-2_23

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-76062-5

  • Online ISBN: 978-3-030-76063-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics