Abstract:
Deep learning services based on cloud computing suffer from deficiencies in latency, privacy, and related concerns. To meet low-latency requirements, researchers have begun to consider deploying deep learning services at the edge, i.e., edge intelligence services. Deploying a deep learning model across multiple processors or devices, so that its computation can be conducted in parallel, is one possible way to improve the efficiency of edge intelligence services. In this article, we propose a novel latency-driven deep learning model placement method for efficient edge intelligence service. Model placement consists of two procedures: model partition and sub-model assignment. In our method, we first convert the model into an execution graph and propose a novel latency-driven multilevel graph partition of the model. The partitioned sub-models are then heuristically assigned to the available processors. To the best of our knowledge, this is the first work to propose latency-driven graph partition algorithms for model placement. Extensive experiments on several commonly used DNN (deep neural network) models and synthetic datasets show that our method achieves the lowest execution latency with low complexity compared with other state-of-the-art model placement methods.
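The two procedures named in the abstract, partitioning a model into sub-models and assigning them to processors, can be illustrated with a minimal sketch. The code below is an assumption for illustration only: it models the network as a simple linear chain of per-layer compute costs, splits it into contiguous sub-models by greedy cost balancing, and assigns the largest sub-models to the fastest processors so that the pipelined bottleneck stage (the slowest stage) bounds steady-state latency. The function names, cost values, and greedy heuristic are hypothetical and are not the authors' latency-driven multilevel graph partition algorithm, which operates on full execution graphs.

```python
# Hypothetical sketch of model placement as two steps: partition, then assign.
# All names, costs, and the greedy balancing heuristic are illustrative
# assumptions, not the paper's algorithm.

def partition_chain(layer_costs, k):
    """Split a linear chain of per-layer costs into k contiguous sub-models,
    greedily closing a sub-model once its cost reaches the balanced target."""
    target = sum(layer_costs) / k
    parts, current, acc = [], [], 0.0
    for cost in layer_costs:
        current.append(cost)
        acc += cost
        if acc >= target and len(parts) < k - 1:
            parts.append(current)
            current, acc = [], 0.0
    parts.append(current)
    return parts

def pipeline_latency(parts, proc_speeds):
    """Assign the costliest sub-models to the fastest processors; in a
    pipelined chain, throughput-limiting latency is the slowest stage."""
    speeds = sorted(proc_speeds, reverse=True)
    order = sorted(range(len(parts)), key=lambda i: -sum(parts[i]))
    stage_time = [0.0] * len(parts)
    for rank, i in enumerate(order):
        stage_time[i] = sum(parts[i]) / speeds[rank % len(speeds)]
    return max(stage_time)

# Per-layer compute costs in arbitrary units; processor speeds in units/sec.
layers = [4.0, 1.0, 3.0, 2.0, 5.0, 1.0]
parts = partition_chain(layers, 3)
print(parts)                                        # sub-model groupings
print(pipeline_latency(parts, [2.0, 1.0, 1.0]))     # bottleneck stage time
```

A real placement method would account for inter-device communication cost on the cut edges and for branching in the execution graph, which is precisely why a graph partition formulation (rather than a chain split) is needed.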
Published in: IEEE Transactions on Services Computing ( Volume: 15, Issue: 2, 01 March-April 2022)