
Hardware-Aware NAS Framework with Layer Adaptive Scheduling on Embedded System

Published: 29 January 2021

Abstract

Neural Architecture Search (NAS) has proven to be an effective solution for building Deep Convolutional Neural Network (DCNN) models automatically. Subsequently, several hardware-aware NAS frameworks have incorporated hardware latency into the search objectives to avoid the risk that the searched network cannot be deployed on the target platform. However, a mismatch between NAS and hardware persists, because the applicability of the searched layers to the hardware mapping is not reconsidered. A convolutional layer can be executed under various hardware dataflows with different performance, as the on-chip data-reuse characteristics vary to fit the parallel structure. This mismatch causes significant performance degradation for some maladaptive layers obtained from NAS, which might achieve a much better latency if the adopted dataflow were changed. To address the issue that overall network latency alone is insufficient to evaluate deployment efficiency, this paper proposes a novel hardware-aware NAS framework that takes into account the adaptability between layers and dataflow patterns. Besides, we develop an optimized layer-adaptive data scheduling strategy as well as a coarse-grained reconfigurable computing architecture, so that the searched networks can be deployed with high power efficiency by selecting the most appropriate dataflow pattern layer by layer under limited resources. Evaluation results show that the proposed NAS framework searches DCNNs with accuracy similar to state-of-the-art models and with low inference latency, and that the proposed architecture both improves power efficiency and saves energy.
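The two ideas in the abstract, picking the best dataflow pattern per layer and folding the resulting latency into the search objective, can be sketched roughly as follows. This is an illustrative Python sketch, not the authors' implementation: the dataflow names, the per-layer latency table, and the MnasNet-style soft-constrained reward (with hypothetical `target_ms` and `beta`) are all assumptions for illustration.

```python
# Illustrative sketch (not the paper's implementation) of layer-adaptive
# dataflow scheduling combined with a latency-aware NAS objective.

# Hypothetical modeled latency (ms) of each candidate dataflow per layer,
# keyed by (layer_type, kernel_size, out_channels).
LATENCY_TABLE = {
    ("conv", 3, 64):  {"weight_stationary": 1.8, "output_stationary": 1.2, "row_stationary": 1.5},
    ("conv", 5, 128): {"weight_stationary": 4.1, "output_stationary": 5.0, "row_stationary": 3.6},
    ("conv", 1, 256): {"weight_stationary": 0.9, "output_stationary": 0.7, "row_stationary": 1.1},
}

def best_dataflow(layer_key):
    """Pick the (dataflow, latency) pair with the lowest modeled latency."""
    patterns = LATENCY_TABLE[layer_key]
    return min(patterns.items(), key=lambda kv: kv[1])

def network_latency(layers):
    """Sum per-layer latencies under layer-adaptive dataflow scheduling."""
    return sum(best_dataflow(k)[1] for k in layers)

def search_reward(accuracy, latency_ms, target_ms=6.0, beta=-0.07):
    """MnasNet-style soft constraint: accuracy * (latency / target) ** beta."""
    return accuracy * (latency_ms / target_ms) ** beta

layers = [("conv", 3, 64), ("conv", 5, 128), ("conv", 1, 256)]
lat = network_latency(layers)  # 1.2 + 3.6 + 0.7 = 5.5 ms
print(round(lat, 1), round(search_reward(0.93, lat), 4))
```

The point of the sketch is that the latency fed to the search objective is the *scheduled* latency (best dataflow chosen per layer), not the latency under a single fixed dataflow, which is the mismatch the abstract describes.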


Cited By

  • (2022) Memory-Computing Decoupling: A DNN Multitasking Accelerator With Adaptive Data Arrangement. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 41, 11 (Nov. 2022), 4112-4123. https://doi.org/10.1109/TCAD.2022.3197493
  • (2021) A Heterogeneous RISC-V Processor for Efficient DNN Application in Smart Sensing System. Sensors 21, 19 (28 Sep. 2021), 6491. https://doi.org/10.3390/s21196491

    Published In

    ASPDAC '21: Proceedings of the 26th Asia and South Pacific Design Automation Conference
    January 2021
    930 pages
    ISBN: 9781450379991
    DOI: 10.1145/3394885

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. NAS
    2. dataflow scheduling
    3. embedded system
    4. hardware-aware
    5. neural networks

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ASPDAC '21

    Acceptance Rates

    ASPDAC '21 Paper Acceptance Rate 111 of 368 submissions, 30%;
    Overall Acceptance Rate 466 of 1,454 submissions, 32%
