poster

Hierarchical Deep Neural Network Inference for Device-Edge-Cloud Systems

Authors:

Ling Liu

WWW '23 Companion: Companion Proceedings of the ACM Web Conference 2023

Pages 302 - 305

https://doi.org/10.1145/3543873.3587370

Published: 30 April 2023

Abstract

Edge computing and cloud computing have been utilized in many AI applications in various fields, such as computer vision, NLP, autonomous driving, and smart cities. To benefit from the advantages of both paradigms, we introduce HiDEC, a hierarchical deep neural network (DNN) inference framework with three novel features. First, HiDEC enables the training of a resource-adaptive DNN through the injection of multiple early exits. Second, HiDEC provides a latency-aware inference scheduler, which determines which input samples should exit locally on an edge device based on the exit scores, enabling inference on edge devices with insufficient resources to run the full model. Third, we introduce a dual thresholding approach allowing both easy and difficult samples to exit early. Our experiments on image and text classification benchmarks show that HiDEC significantly outperforms existing solutions.
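
The dual thresholding idea in the abstract can be illustrated with a minimal sketch. The abstract does not define the exit score or say what happens to difficult samples, so everything below is an assumption made for illustration: the exit score is taken to be the maximum softmax probability at an exit, samples at or above an upper threshold are treated as easy and answered locally, samples at or below a lower threshold are treated as difficult and handed to the next tier without running further local exits, and the names `route_sample`, `upper`, and `lower` are hypothetical. This is not HiDEC's actual scheduler.

```python
import math
from typing import Iterable, List, Optional, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_sample(
    exit_logits: Iterable[List[float]],
    upper: float = 0.9,  # assumed threshold for "easy" samples
    lower: float = 0.4,  # assumed threshold for "difficult" samples
) -> Tuple[str, int, Optional[int]]:
    """Walk the local early exits in order and decide where the sample goes.

    Returns (decision, exit_index, predicted_class). The exit score here is
    the max softmax probability, which is an assumption, not the paper's
    definition.
    """
    last = -1
    for last, logits in enumerate(exit_logits):
        probs = softmax(logits)
        score = max(probs)
        if score >= upper:
            # Easy sample: confident enough to answer on the device.
            return ("exit_local", last, probs.index(score))
        if score <= lower:
            # Difficult sample: stop local work and offload right away.
            return ("offload", last, None)
    # No local exit was decisive, so hand the sample to the next tier.
    return ("offload", last, None)


if __name__ == "__main__":
    print(route_sample([[4.0, 0.0, 0.0]]))                   # confident at exit 0
    print(route_sample([[0.1, 0.0, 0.05]]))                  # near-uniform scores
    print(route_sample([[1.0, 0.0, 0.0], [5.0, 0.0, 0.0]]))  # decided at exit 1
```

Passing a generator as `exit_logits` keeps the evaluation lazy, so later exits are only computed when the earlier ones are undecided, which is where the latency saving in an early-exit design would come from.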

Cited By

Rahmath P H, Srivastava V, Chaurasia K, Pacheco R, and Couto R. 2024. Early-Exit Deep Neural Network - A Comprehensive Survey. ACM Computing Surveys 57:3 (1-37). DOI: 10.1145/3698767. Online publication date: 22-Nov-2024.
https://dl.acm.org/doi/10.1145/3698767

Tang S, Yu Y, Wang H, Wang G, Chen W, Xu Z, Guo S, and Gao W. 2023. A Survey on Scheduling Techniques in Computing and Network Convergence. IEEE Communications Surveys & Tutorials 26:1 (160-195). DOI: 10.1109/COMST.2023.3329027. Online publication date: 1-Nov-2023.
https://dl.acm.org/doi/10.1109/COMST.2023.3329027

Index Terms

Hierarchical Deep Neural Network Inference for Device-Edge-Cloud Systems
1. Computing methodologies
  1. Artificial intelligence
    1. Distributed artificial intelligence

Information & Contributors

Information

Published In

WWW '23 Companion: Companion Proceedings of the ACM Web Conference 2023

April 2023

1567 pages

ISBN:9781450394192

DOI:10.1145/3543873

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Poster
Research
Refereed limited

Conference

WWW '23

Sponsor:

SIGWEB

WWW '23: The ACM Web Conference 2023

April 30 - May 4, 2023

Austin, TX, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 2
Total Downloads: 222
Downloads (Last 12 months): 60
Downloads (Last 6 weeks): 5

Reflects downloads up to 25 Feb 2025
