invited-talk

Towards Ultra-Efficient DNN Inference Acceleration on Edge Devices for Wellbeing Applications

Author:

Yanzhi Wang

HealthDL'20: Proceedings of Deep Learning for Wellbeing Applications Leveraging Mobile Devices and Edge Computing

Page 17

https://doi.org/10.1145/3396868.3402499

Published: 19 June 2020


Abstract

Deep Neural Networks (DNNs) have become fundamental building blocks for a broad spectrum of machine learning applications because of their superior performance. However, deploying deep learning on the rapidly growing population of edge devices, such as wearable Internet of Things (IoT) devices, smartphones, and smart health devices with embedded sensors, is difficult because their computation and memory resources are limited. Efficient system designs and algorithms are therefore needed for the wide deployment of deep learning inference on edge devices. To address this problem, we propose a set of hardware-friendly structured model pruning and compiler optimization techniques that accelerate DNN execution on edge devices. Structured model pruning is adopted to satisfy the limited computation and memory constraints and to enable potential hardware acceleration. Compiler optimization is then used to further improve DNN inference performance on edge devices. With the proposed techniques, we achieve real-time DNN inference on edge devices, as shown in a demo with various DNN applications deployed on mobile devices. These techniques enable us to explore impactful deep learning solutions on cheaper, more affordable wearable IoT devices that support users' well-being. Specifically, better personalization of health-related solutions can help care for users' health and enhance their experience through the superior performance of deep learning on smart health devices.
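As a rough illustration of what structured pruning means in general, the sketch below removes whole convolution filters instead of individual weights, so the remaining layer is smaller but still dense and maps directly onto ordinary hardware kernels. The abstract does not state the pruning criterion or granularity used in the talk. The L1-norm ranking, the nested-list layer layout, and the names `prune_filters` and `prune_next_layer` are all assumptions made for this example only.

```python
from typing import List, Tuple

# Hypothetical layout: layer[out_channel][in_channel] is a flat list of kernel weights.
Kernel = List[float]
Layer = List[List[Kernel]]


def filter_l1_norm(filt: List[Kernel]) -> float:
    """Sum of absolute weights over every kernel of one output filter."""
    return sum(abs(w) for kernel in filt for w in kernel)


def prune_filters(layer: Layer, keep_ratio: float) -> Tuple[Layer, List[int]]:
    """Keep the filters with the largest L1 norm (an assumed criterion).

    Returns the smaller layer and the original indices of the kept filters,
    in their original order.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, round(len(layer) * keep_ratio))
    ranked = sorted(range(len(layer)),
                    key=lambda i: filter_l1_norm(layer[i]),
                    reverse=True)
    keep = sorted(ranked[:n_keep])
    return [layer[i] for i in keep], keep


def prune_next_layer(next_layer: Layer, keep: List[int]) -> Layer:
    """Drop the input channels of the following layer that fed on removed filters."""
    return [[filt[i] for i in keep] for filt in next_layer]


# Toy layers: conv1 has 4 filters over 1 input channel,
# conv2 has 2 filters over the 4 channels produced by conv1.
conv1: Layer = [[[0.1, -0.1]], [[2.0, 1.0]], [[0.0, 0.05]], [[-1.5, 0.5]]]
conv2: Layer = [[[1.0], [2.0], [3.0], [4.0]],
                [[5.0], [6.0], [7.0], [8.0]]]

pruned1, keep = prune_filters(conv1, keep_ratio=0.5)
pruned2 = prune_next_layer(conv2, keep)
print(keep)      # indices of the surviving conv1 filters
print(pruned2)   # conv2 with matching input channels removed
```

Because entire filters disappear, the following layer loses the corresponding input channels as well. That is why both computation and memory shrink without any sparse-matrix support.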


Index Terms

Towards Ultra-Efficient DNN Inference Acceleration on Edge Devices for Wellbeing Applications
1. General and reference
  1. Document types
    1. General literature



Published In

HealthDL'20: Proceedings of Deep Learning for Wellbeing Applications Leveraging Mobile Devices and Edge Computing

June 2020

22 pages

ISBN:9781450380126

DOI:10.1145/3396868

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

In-Cooperation

SIGOPS: ACM Special Interest Group on Operating Systems

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Invited-talk
Research
Refereed limited

Conference

MobiSys '20

Sponsor:

SIGMOBILE

MobiSys '20: The 18th Annual International Conference on Mobile Systems, Applications, and Services

June 19, 2020

Toronto, ON, Canada


