DOI: 10.1145/3331052.3332477
research-article

Distributed Deep Neural Network Deployment for Smart Devices from the Edge to the Cloud

Published: 02 July 2019

ABSTRACT

Traditionally, deep learning acceleration mostly focuses on the trade-off between accuracy and training time but seldom addresses deployment over hierarchical 5G networks to maximize inference throughput. By contrast, computing offloading research emphasizes whether to offload tasks to the cloud to reduce computing time and achieve a lower response time; thus, the optimal deployment to maximize throughput has not been explored. In this paper, we explore the Distributed Deep Neural Network Deployment Problem with Constrained Completion Time (TREND-WANT) to solve the deployment problem considering both response time and inference throughput. Due to the intractability of TREND-WANT, we first design a new algorithm, named Stage-Time-Aware Layer Deployment Algorithm (STEED), to maximize the throughput. Afterward, an extension termed STEED with Adaptable Completion Time (STEED-ADAPT) is developed to tailor the solution to achieve a lower response time. Simulation results show that our algorithms outperform the traditional methods by at least 200%.

            Published in

            PERSIST-IoT '19: Proceedings of the ACM MobiHoc Workshop on Pervasive Systems in the IoT Era
            July 2019
            64 pages
            ISBN: 9781450368056
            DOI: 10.1145/3331052

            Copyright © 2019 ACM

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Qualifiers

            • research-article
            • Research
            • Refereed limited
