
Evaluating the Performance of Deep Learning Inference Service on Edge Platform


Abstract:

Deep learning inference requires a tremendous amount of computation and is typically offloaded to the cloud for execution. Recently, edge computing, which processes and stores data at the edge of the Internet, closest to the mobile devices or sensors, has been considered as a new computing paradigm. We have studied the performance of deep neural network (DNN) inference services under different configurations of the resources assigned to a container. In this work, we measured and analyzed a real-world edge service on a containerization platform. The edge service, named A!Eye, is an application with various DNN inferences; it has both CPU-friendly and GPU-friendly tasks, and the CPU tasks account for more than half of the service's latency. Our analyses reveal interesting findings about running a DNN inference service on a container-based execution platform: (a) the latency of DNN inference-based edge services is affected by the performance of CPU-based operations; (b) pinning CPUs can reduce the latency of an edge service; (c) to improve the performance of an edge service, it is important to avoid the PCIe bottleneck shared by resources such as CPUs, GPUs, and NICs.
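To make finding (b) concrete, below is a minimal sketch of how CPU pinning can be compared against the default scheduler on a Linux host. It is not the paper's measurement code: the workload function and CPU set are illustrative stand-ins, and os.sched_setaffinity is a Linux-only API; in a containerized deployment the analogous knob would be a flag such as Docker's --cpuset-cpus.

import os
import time

def fake_inference(n: int = 2_000_000) -> float:
    # Stand-in for the CPU-side pre/post-processing work of a DNN inference.
    acc = 0.0
    for i in range(n):
        acc += (i % 7) * 0.5
    return acc

def timed_run(label: str, repeats: int = 5) -> None:
    # Average wall-clock latency over several runs.
    start = time.perf_counter()
    for _ in range(repeats):
        fake_inference()
    elapsed = (time.perf_counter() - start) / repeats
    print(f"{label}: {elapsed * 1000:.1f} ms per run")

if __name__ == "__main__":
    timed_run("default affinity")
    # Pin this process to CPUs 0-3 (Linux-only; CPU IDs are an assumption).
    # Inside a container, `docker run --cpuset-cpus="0-3" ...` has a similar effect.
    os.sched_setaffinity(0, {0, 1, 2, 3})
    timed_run("pinned to CPUs 0-3")

Pinning helps when it keeps the service's threads on fixed cores, preserving cache locality and avoiding migration by the scheduler; whether it pays off for a given service depends on the workload mix the abstract describes.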
Date of Conference: 20-22 October 2021
Date Added to IEEE Xplore: 07 December 2021
Print on Demand(PoD) ISSN: 2162-1233
Conference Location: Jeju Island, Korea, Republic of
