Abstract
Edge intelligence, a new computing paradigm, aims to execute part of Artificial Intelligence (AI)-based tasks on the edge in order to reduce latency, lower energy consumption, and improve privacy. Deep Neural Networks (DNNs), among the most important AI techniques, are widely used in various fields. For DNN-based tasks, a computing scheme named DNN model partition can further reduce execution time. This scheme partitions a DNN task into two parts: one is executed on the end device and the other on an edge server. However, in a complex edge computing system, it is difficult to coordinate DNN model partition and task allocation. In this work, we study this problem in a heterogeneous edge computing system. We first establish a mathematical model of adaptive DNN model partition and task offloading. The model contains a large number of binary variables, so in a multi-task scenario the solution space is too large to search directly. We then use dynamic programming and a greedy strategy to reduce the solution space while still yielding a good solution, and propose an offline algorithm named GSPI. Considering practical conditions, we subsequently propose an online algorithm. Our experiments and simulations show that GSPI reduces the system time cost by at least 32% and that the online algorithm reduces it by at least 24%.
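The partition idea described above can be illustrated with a minimal sketch. It is not the GSPI algorithm or the paper's mathematical model: it assumes a single task, one end device, one edge server, and a chain-structured DNN, and it simply tries every cut point. The function name `best_partition`, its parameters, and all per-layer numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def best_partition(
    device_ms: List[float],      # assumed per-layer run time on the end device
    edge_ms: List[float],        # assumed per-layer run time on the edge server
    output_kb: List[float],      # assumed size of each layer's output
    input_kb: float,             # assumed size of the raw input
    bandwidth_kb_per_ms: float,  # assumed uplink bandwidth
) -> Tuple[int, float]:
    """Return (k, total_ms): layers [0, k) run on the device, [k, n) on the edge.

    k == 0 offloads the whole model, k == n keeps everything on the device.
    """
    n = len(device_ms)
    if not (len(edge_ms) == n and len(output_kb) == n):
        raise ValueError("per-layer lists must have the same length")

    best_k, best_cost = 0, float("inf")
    device_prefix = 0.0           # time of layers already placed on the device
    edge_suffix = sum(edge_ms)    # time of layers still placed on the edge

    for k in range(n + 1):
        if k == 0:
            sent_kb = input_kb            # ship the raw input
        elif k == n:
            sent_kb = 0.0                 # nothing leaves the device
        else:
            sent_kb = output_kb[k - 1]    # ship the intermediate feature map

        cost = device_prefix + sent_kb / bandwidth_kb_per_ms + edge_suffix
        if cost < best_cost:
            best_k, best_cost = k, cost

        if k < n:
            # Move layer k from the edge side to the device side.
            device_prefix += device_ms[k]
            edge_suffix -= edge_ms[k]

    return best_k, best_cost


if __name__ == "__main__":
    k, total = best_partition(
        device_ms=[4, 6, 10],
        edge_ms=[1, 2, 3],
        output_kb=[80, 10, 1],
        input_kb=100,
        bandwidth_kb_per_ms=10.0,
    )
    print(f"cut before layer {k}, estimated time {total} ms")
```

With these made-up numbers, cutting after the second layer wins because its output is small enough that the transfer cost no longer outweighs the faster edge execution. In a multi-task, multi-server setting each task also needs a server assignment, which is where the binary variables and the large solution space mentioned in the abstract come from.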
Acknowledgements
This article was supported by the National Key Research and Development Plan (Grant No. 2018YFB2000505), the National Natural Science Foundation of China (Grant No. 61806067), and the Key Research and Development Project in Anhui Province (Grant No. 201904a06020024).
Copyright information
© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Shi, L., Xu, Z., Shi, Y., Fan, Y., Ding, X., Sun, Y. (2021). A DNN Inference Acceleration Algorithm in Heterogeneous Edge Computing: Joint Task Allocation and Model Partition. In: Gao, H., Wang, X., Iqbal, M., Yin, Y., Yin, J., Gu, N. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 349. Springer, Cham. https://doi.org/10.1007/978-3-030-67537-0_15
DOI: https://doi.org/10.1007/978-3-030-67537-0_15
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67536-3
Online ISBN: 978-3-030-67537-0
eBook Packages: Computer Science, Computer Science (R0)