Abstract:
Recently, edge computing has received considerable attention as a promising solution for providing deep learning-based video analysis services in real time. However, due to the limited computation capability of the data processing units (such as CPUs, GPUs, and specialized accelerators) embedded in edge devices, how to make the best use of these limited resources is one of the most pressing issues affecting the efficiency of deep learning-based video analysis services. In this paper, we introduce a practical approach to optimizing deep learning object detection on edge devices equipped with CPUs and GPUs. The proposed approach adopts TVM, an automated end-to-end deep learning compiler that optimizes deep learning workloads with respect to hardware-specific characteristics. In addition, task-level pipeline parallelism is applied to maximize the resource utilization of the CPUs and GPUs and thereby improve overall object detection performance. Experimental results show that the proposed approach improves object detection performance on multiple video streams in terms of frames per second.
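This page carries only the abstract, so the sketch below is not taken from the paper. It is a rough, hypothetical illustration of the two ideas the abstract names: compiling a detector with TVM's Relay API and overlapping CPU pre-processing with GPU inference via task-level pipeline parallelism. The model file "detector.onnx", the input name and 1x3x300x300 shape, the queue sizes, and the stage split are all assumptions, and the TVM calls follow the Python API of recent (>= 0.8) releases.

    import queue
    import threading

    import numpy as np
    import onnx
    import tvm
    from tvm import relay
    from tvm.contrib import graph_executor

    # --- Compile the detector with TVM for the on-board GPU ---
    # "detector.onnx" and the 1x3x300x300 input are placeholders for
    # whatever detection model the deployment actually uses.
    onnx_model = onnx.load("detector.onnx")
    mod, params = relay.frontend.from_onnx(onnx_model, shape={"input": (1, 3, 300, 300)})

    target = tvm.target.Target("cuda")  # pick the target matching the edge GPU
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target=target, params=params)

    dev = tvm.cuda(0)
    detector = graph_executor.GraphModule(lib["default"](dev))

    # --- Task-level pipeline: CPU pre-processing feeds GPU inference ---
    frames = queue.Queue(maxsize=8)  # bounded queue decouples the two stages
    results = queue.Queue()

    def preprocess_worker(stream):
        # CPU stage: normalize frames (assumed already decoded to 300x300 RGB)
        # and lay them out as NCHW batches of one.
        for frame in stream:
            x = np.transpose(frame.astype("float32") / 255.0, (2, 0, 1))[None]
            frames.put(x)
        frames.put(None)  # sentinel: no more frames

    def inference_worker():
        # GPU stage: run the compiled detector; while it works on frame t,
        # the CPU stage is already preparing frame t+1.
        while True:
            x = frames.get()
            if x is None:
                results.put(None)
                break
            detector.set_input("input", tvm.nd.array(x, dev))
            detector.run()
            results.put(detector.get_output(0).numpy())

    def run_pipeline(stream):
        # Wire the two stages together; one pre-processing thread per video
        # stream can share a single GPU inference stage.
        threading.Thread(target=preprocess_worker, args=(stream,), daemon=True).start()
        threading.Thread(target=inference_worker, daemon=True).start()
        detections = []
        while (r := results.get()) is not None:
            detections.append(r)
        return detections

The bounded frame queue is the essential design choice in this sketch: it lets the CPU and GPU stages run concurrently while preventing a fast producer from buffering unbounded frames on a memory-constrained edge device.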
Published in: 2020 International Conference on Information and Communication Technology Convergence (ICTC)
Date of Conference: 21-23 October 2020
Date Added to IEEE Xplore: 21 December 2020
Print on Demand (PoD) ISSN: 2162-1233