DOI: 10.1145/3396868.3402499
invited-talk

Towards Ultra-Efficient DNN Inference Acceleration on Edge Devices for Wellbeing Applications

Published: 19 June 2020

Abstract

Deep Neural Networks (DNNs) serve as fundamental building blocks for a broad spectrum of machine learning applications because of their superior performance. However, deploying deep learning on the rapidly growing population of edge devices, such as wearable Internet of Things (IoT) devices, smartphones, and smart health devices with embedded sensors, is difficult because of their limited computation and memory resources. Efficient system designs and algorithms are therefore needed for the wide deployment of deep learning inference on edge devices. To address this problem, we propose a set of hardware-friendly structured model pruning and compiler optimization techniques to accelerate DNN execution on edge devices. Structured model pruning is adopted to satisfy the limited computation and memory constraints and to enable potential hardware acceleration. Compiler optimization is then used to further accelerate DNN inference on edge devices. With the proposed techniques, we achieve real-time DNN inference on edge devices, as shown in a demo with various DNN applications deployed on mobile devices. These techniques enable us to explore impactful deep learning solutions on more affordable wearable IoT devices to support users' wellbeing. Specifically, better personalization of health-related solutions can help care for users' health and enhance their experience through the superior performance of deep learning on smart health devices.
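
The abstract does not say which pruning criterion or tensor layout the authors use. As a rough illustration of what "structured" pruning means in general, the sketch below removes whole convolution filters with the smallest L2 norms, which leaves a smaller dense tensor rather than a sparse one. The function name `prune_filters`, the `keep_ratio` parameter, the L2-norm criterion, and the `(out_ch, in_ch, kh, kw)` layout are all assumptions made for this example and are not taken from the talk.

```python
import numpy as np


def prune_filters(weight, keep_ratio):
    """Illustrative filter-level (structured) pruning; not the authors' method.

    weight: conv weights shaped (out_ch, in_ch, kh, kw) (assumed layout).
    keep_ratio: fraction of output filters to keep, in (0, 1].
    Returns the smaller dense weight tensor and the kept filter indices.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    out_ch = weight.shape[0]
    n_keep = max(1, int(round(out_ch * keep_ratio)))
    # Score each output filter by the L2 norm of all its weights (assumed criterion).
    norms = np.linalg.norm(weight.reshape(out_ch, -1), axis=1)
    # Keep the n_keep highest-scoring filters, preserving their original order.
    kept = np.sort(np.argsort(norms)[-n_keep:])
    return weight[kept], kept


rng = np.random.default_rng(0)
w = rng.normal(size=(8, 3, 3, 3))
pruned, kept = prune_filters(w, keep_ratio=0.5)
print(pruned.shape)  # (4, 3, 3, 3)
```

Because entire filters are dropped, the pruned layer stays dense and can run on ordinary dense kernels. In a full network, the next layer's input channels would also need to be sliced with the same `kept` indices.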

Cited By

  • (2023) RLAlloc: A Deep Reinforcement Learning-Assisted Resource Allocation Framework for Enhanced Both I/O Throughput and QoS Performance of Multi-Streamed SSDs. 2023 60th ACM/IEEE Design Automation Conference (DAC), pp. 1-6. DOI: 10.1109/DAC56929.2023.10247988. Online publication date: 9-Jul-2023
  • (2022) Improving QoE of Deep Neural Network Inference on Edge Devices: A Bandit Approach. IEEE Internet of Things Journal 9:21, pp. 21409-21420. DOI: 10.1109/JIOT.2022.3182728. Online publication date: 1-Nov-2022
  • (2022) Edge Deployment Framework of GuardBot for Optimized Face Mask Recognition With Real-Time Inference Using Deep Learning. IEEE Access 10, pp. 77898-77921. DOI: 10.1109/ACCESS.2022.3190538. Online publication date: 2022

    Published In

    HealthDL'20: Proceedings of Deep Learning for Wellbeing Applications Leveraging Mobile Devices and Edge Computing
    June 2020
    22 pages
    ISBN: 9781450380126
    DOI: 10.1145/3396868
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Compiler
    2. Deep Neural Networks
    3. Machine Learning
    4. Mobile Acceleration
    5. Model Compression

    Qualifiers

    • Invited-talk
    • Research
    • Refereed limited

    Conference

    MobiSys '20