DOI: 10.1145/3570361.3592514
research-article

Re-thinking computation offload for efficient inference on IoT devices with duty-cycled radios

Published: 10 July 2023

ABSTRACT

While a number of recent efforts have explored "cloud offload" to enable deep learning on IoT devices, these efforts have not considered duty-cycled radios such as BLE. We argue that radio duty-cycling significantly diminishes the performance of existing cloud-offload methods. We tackle this problem by leveraging a previously unexplored opportunity: early-exit offload enhanced with prioritized communication, dynamic pooling, and dynamic fusion of features. We show that our system, FLEET, achieves significant gains in accuracy, latency, and compute budget over state-of-the-art local early-exit, remote-processing, and model-partitioning schemes across a range of DNN models, datasets, and IoT platforms.
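As a hedged illustration of the kind of pipeline the abstract describes (a sketch, not the FLEET implementation), the Python snippet below shows a confidence-thresholded early exit that keeps inference on-device when the local exit head is confident, and otherwise pools and priority-orders intermediate features before handing them to a transmit callback, as one might over a duty-cycled BLE link. The names (`EXIT_THRESHOLD`, `pool_and_prioritize`, `send_fn`) and the specific pooling and channel-energy prioritization heuristics are illustrative assumptions, not details from the paper.

```python
# Illustrative sketch (not the FLEET implementation): confidence-thresholded
# early exit with pooled, priority-ordered feature transfer over a slow link.
import numpy as np

EXIT_THRESHOLD = 0.85   # hypothetical confidence cutoff for exiting locally
POOL_SIZE = 4           # hypothetical pooling factor applied before transmission

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

def local_early_exit(features, exit_head):
    """Run a cheap exit classifier (global-average-pool + linear head) on intermediate features."""
    logits = features.mean(axis=(1, 2)) @ exit_head
    probs = softmax(logits)
    return int(probs.argmax()), float(probs.max())

def pool_and_prioritize(features, pool=POOL_SIZE):
    """Downsample features and order channels by energy so the most
    informative ones are transmitted first on a duty-cycled radio."""
    c, h, w = features.shape
    pooled = features.reshape(c, h // pool, pool, w // pool, pool).mean(axis=(2, 4))
    order = np.argsort(-np.abs(pooled).sum(axis=(1, 2)))  # high-energy channels first
    return pooled[order], order

def infer(features, exit_head, send_fn):
    label, conf = local_early_exit(features, exit_head)
    if conf >= EXIT_THRESHOLD:
        return label                          # confident: exit on-device
    pooled, order = pool_and_prioritize(features)
    return send_fn(pooled, order)             # otherwise offload prioritized features

# Toy usage with random weights and a stub "cloud" callback standing in for the remote model.
if __name__ == "__main__":
    feats = np.random.rand(16, 8, 8).astype(np.float32)   # C x H x W intermediate tensor
    head = np.random.rand(16, 10).astype(np.float32)
    cloud = lambda pooled, order: int(pooled.sum(axis=(1, 2)).argmax())  # placeholder server
    print(infer(feats, head, cloud))
```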


          Published in
          ACM MobiCom '23: Proceedings of the 29th Annual International Conference on Mobile Computing and Networking
          October 2023
          1605 pages
          ISBN: 9781450399906
          DOI: 10.1145/3570361

          Copyright © 2023 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall acceptance rate: 440 of 2,972 submissions, 15%