DOI: 10.1145/3543873.3587370
poster

Hierarchical Deep Neural Network Inference for Device-Edge-Cloud Systems

Published: 30 April 2023

Abstract

Edge computing and cloud computing have been utilized in many AI applications in various fields, such as computer vision, natural language processing (NLP), autonomous driving, and smart cities. To benefit from the advantages of both paradigms, we introduce HiDEC, a hierarchical deep neural network (DNN) inference framework with three novel features. First, HiDEC enables the training of a resource-adaptive DNN through the injection of multiple early exits. Second, HiDEC provides a latency-aware inference scheduler that uses the exit scores to decide which input samples should exit locally on an edge device, enabling inference on edge devices with insufficient resources to run the full model. Third, we introduce a dual thresholding approach allowing both easy and difficult samples to exit early. Our experiments on image and text classification benchmarks show that HiDEC significantly outperforms existing solutions.
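The abstract does not define the exit score or the two thresholds, so the following is only an illustrative sketch of how a dual-threshold early-exit rule could work, not HiDEC's actual implementation. It assumes the exit score is the maximum softmax probability at an exit, that a sample exits as "easy" when the score reaches a hypothetical `upper` threshold, and that it exits as "difficult" when the score falls to or below a hypothetical `lower` threshold (on the assumption that deeper layers are unlikely to help). All function and parameter names are invented for illustration.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def exit_score(logits: Sequence[float]) -> float:
    """Assumed exit score: the maximum softmax probability."""
    return max(softmax(logits))


def dual_threshold_inference(
    exit_logits: Sequence[Sequence[float]],
    upper: float,
    lower: float,
) -> Tuple[int, int]:
    """Return (predicted_class, exit_index) for one input sample.

    exit_logits holds the logits produced at each exit, ordered from the
    shallowest exit to the final classifier. They are precomputed here for
    simplicity; a real system would evaluate an exit only when the sample
    reaches it.
    """
    last = len(exit_logits) - 1
    for i, logits in enumerate(exit_logits):
        score = exit_score(logits)
        # Exit if the sample looks easy (score >= upper), looks too hard
        # to benefit from deeper layers (score <= lower), or no exits remain.
        if i == last or score >= upper or score <= lower:
            probs = softmax(logits)
            return probs.index(max(probs)), i
    raise ValueError("exit_logits must contain at least one exit")
```

Under these assumptions, a sample with first-exit logits `[6.0, 0.0, 0.0]` exits at the first exit as an easy sample, one with `[0.1, 0.0, 0.05]` exits there as a difficult sample, and one with `[2.0, 0.0, 0.0]` continues to the next exit when `upper=0.9` and `lower=0.4`.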



    Published In

    WWW '23 Companion: Companion Proceedings of the ACM Web Conference 2023
    April 2023
    1567 pages
    ISBN:9781450394192
    DOI:10.1145/3543873
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. distributed systems
    2. edge computing
    3. neural networks

    Qualifiers

    • Poster
    • Research
    • Refereed limited

    Conference

    WWW '23
    WWW '23: The ACM Web Conference 2023
    April 30 - May 4, 2023
    Austin, TX, USA

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
