DOI: 10.1145/3345768.3355917
research-article
Best Paper

SEE: Scheduling Early Exit for Mobile DNN Inference during Service Outage

Published: 25 November 2019

ABSTRACT

In recent years, the rapid development of edge computing has made it possible to run a wide variety of intelligent applications at the edge, such as real-time video analytics. However, edge computing can suffer from service outages caused by fluctuating wireless connections or congested computing resources. During a service outage, the only choice is to process deep neural network (DNN) inference on the local mobile device. The obstacle is that, because of the device's limited resources, it may not be possible to complete inference tasks on time. Inspired by the recently developed early exit of DNNs, where inference exits the DNN at an earlier layer to shorten the inference delay at the cost of an acceptable loss of accuracy, we propose to adopt this mechanism to process inference tasks during a service outage. The challenge is how to obtain the optimal schedule given the diverse early exit choices. To this end, we formulate an optimal scheduling problem whose objective is to maximize a general overall utility. However, the problem is an integer program, which cannot be solved by a standard approach. We therefore prove the Ordered Scheduling structure, which states that a frame that arrives earlier must be scheduled earlier. This structure greatly reduces the search space for an optimal solution. We then propose the Scheduling Early Exit (SEE) algorithm, based on dynamic programming, which solves the problem optimally with polynomial computational complexity. Finally, we conduct trace-driven simulations and compare SEE with two benchmarks. The results show that SEE can outperform the benchmarks by 50.9%.
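
The abstract does not state the formulation, so the following is only an illustrative sketch of how ordered scheduling and dynamic programming can fit together. It is not the SEE algorithm itself. The model is assumed for illustration: a single local processor, integer time slots, one shared per-frame delay bound (`deadline`), and a hypothetical table `exits` of `(processing_time, utility)` pairs, one per early exit. Frames are considered strictly in arrival order, so the state reduces to the index of the next frame and the time at which the processor becomes free.

```python
from functools import lru_cache


def schedule_early_exits(arrivals, deadline, exits):
    """Toy early-exit scheduler (illustrative model, not the paper's SEE).

    arrivals: sorted integer arrival times, one per frame.
    deadline: assumed maximum allowed delay (finish time - arrival time).
    exits:    hypothetical list of (processing_time, utility) per exit point.
    Returns (total_utility, plan); plan[i] is the chosen exit index for
    frame i, or None if the frame is dropped.
    """
    n = len(arrivals)

    @lru_cache(maxsize=None)
    def best(i, free_at):
        # All frames have been considered.
        if i == n:
            return 0, ()

        # Option 1: drop frame i and keep the processor state unchanged.
        best_utility, tail_plan = best(i + 1, free_at)
        best_plan = (None,) + tail_plan

        # Option 2: run frame i to one of the exits. Frames are served in
        # arrival order, so frame i starts once it has arrived and the
        # processor is free.
        start = max(free_at, arrivals[i])
        for k, (duration, utility) in enumerate(exits):
            finish = start + duration
            if finish - arrivals[i] > deadline:
                continue  # this exit would miss the delay bound
            tail_utility, tail_plan = best(i + 1, finish)
            if utility + tail_utility > best_utility:
                best_utility = utility + tail_utility
                best_plan = (k,) + tail_plan
        return best_utility, best_plan

    return best(0, 0)


if __name__ == "__main__":
    # Three frames arriving one slot apart, three exits of growing cost.
    total, plan = schedule_early_exits([0, 1, 2], 3, [(1, 5), (2, 8), (4, 10)])
    print(total, plan)
```

Because frames are never reordered in this toy model, the number of distinct states is bounded by the number of frames times the number of reachable finish times, which keeps the search polynomial in the frame count and the time horizon. Without an ordering property, the scheduler would also have to search over permutations of the frames.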

    • Published in

      MSWIM '19: Proceedings of the 22nd International ACM Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems
      November 2019
      340 pages
      ISBN: 9781450369046
      DOI: 10.1145/3345768

      Copyright © 2019 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 398 of 1,577 submissions, 25%
