ABSTRACT
Convolutional neural network (CNN)-based object detection is a key technology for enabling autonomous mobile vision applications on mobile end devices such as smartphones and drones. With the advance of edge computing, a prevalent solution is to offload computation-intensive CNN inference tasks to edge networks for fast and accurate object detection. However, a single edge server may lack the resources to deliver fast and accurate detection on its own. In this paper, we propose MASS, a multi-edge assisted fast object detection framework that further reduces object detection latency while maintaining detection accuracy. In MASS, the CNN model is divided into two parts, the Head part and the Tail part. The Head part is executed locally, and the Tail part is further split into multiple subtasks that can be offloaded to multiple edge servers for parallel execution. First, we propose a method to select the optimal parallel entry point, which separates the Head part from the Tail part. Then, we propose an adaptive subtask generation and offloading strategy that divides the Tail part into multiple subtasks and offloads them to multiple heterogeneous edge servers. We also propose a uniformly sampled zero-padding scheme to reduce the communication cost among edge servers when they execute these subtasks in parallel. We implement MASS in a testbed with four edge servers and evaluate its performance. The experimental results show that MASS can reduce object detection latency by up to 64.83%, while the detection accuracy degradation is less than 3%.
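As a rough illustration of the subtask generation idea described above, the following minimal Python sketch splits the rows of a Tail-part input feature map into contiguous bands, one per edge server, sized in proportion to an assumed relative speed for each server. The function name `split_rows`, the speed values, and the proportional rule itself are assumptions made for illustration only; the abstract does not specify the strategy MASS actually uses, and the zero-padding scheme is not modelled here.

```python
def split_rows(height, speeds):
    """Split `height` feature-map rows into contiguous bands, one per server.

    `speeds` holds a hypothetical relative processing speed per server;
    faster servers receive proportionally more rows. Returns a list of
    half-open (start, end) row ranges that together cover [0, height).
    """
    if height < 0:
        raise ValueError("height must be non-negative")
    if not speeds or any(s <= 0 for s in speeds):
        raise ValueError("speeds must be a non-empty list of positive numbers")

    total = sum(speeds)
    bands = []
    start = 0
    cumulative = 0.0
    for index, speed in enumerate(speeds):
        cumulative += speed
        if index == len(speeds) - 1:
            # The last band always ends at the final row, so no row is lost
            # to rounding.
            end = height
        else:
            # Place the boundary at the cumulative share of the total speed.
            end = round(height * cumulative / total)
        bands.append((start, end))
        start = end
    return bands


if __name__ == "__main__":
    # Three servers, the third assumed to be twice as fast as the others.
    print(split_rows(100, [1, 1, 2]))  # [(0, 25), (25, 50), (50, 100)]
```

Computing each boundary from the cumulative share, rather than rounding each band's size separately, keeps the bands contiguous and guarantees that they cover every row exactly once.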
Index Terms
- MASS: Multi-edge Assisted Fast Object Detection for Autonomous Mobile Vision in Heterogeneous Edge Networks