
Real-Time Meets Approximate Computing: An Elastic CNN Inference Accelerator with Adaptive Trade-off between QoS and QoR

DOI: 10.1145/3061639.3062307
Published: 18 June 2017

Abstract

Due to recent progress in deep learning and neural acceleration architectures, specialized deep neural network (DNN) and convolutional neural network (CNN) accelerators are expected to provide an energy-efficient solution for real-time vision/speech processing, recognition, and a wide spectrum of approximate computing applications. Beyond this broad applicability, we also find that their deterministic performance and high energy efficiency make such deep learning (DL) accelerators ideal candidates as application-processor IPs in embedded SoCs concerned with real-time processing. However, unlike traditional accelerator designs, DL accelerators introduce a new design trade-off between real-time processing (quality of service, QoS) and computation approximation (quality of result, QoR) into embedded systems. This work proposes an elastic CNN acceleration architecture that automatically adapts to hard QoS constraints by exploiting the error resilience of typical approximate computing workloads. For the first time, the proposed design, comprising network tuning-and-mapping software and reconfigurable accelerator hardware, aims to reconcile the design constraints of QoS and QoR, which are the key concerns of real-time and approximate computing, respectively. Experiments show that the proposed architecture enables the embedded system to operate flexibly in an expanded operating space, significantly enhances its real-time capability, and maximizes the system's energy efficiency within the user-specified QoS-QoR constraint through self-reconfiguration.
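The QoS-QoR adaptation the abstract describes can be pictured as a small runtime selection policy. The following Python sketch is an illustration only, not the paper's actual tuning-and-mapping algorithm; all names, configurations, and numbers in it are hypothetical. It shows the core idea: among candidate accelerator configurations at different approximation levels, pick the highest-QoR one whose estimated latency still meets the hard QoS deadline, and degrade gracefully when none fits.

    # A minimal sketch of QoS-constrained QoR maximization.
    # All configuration names and numbers below are hypothetical,
    # illustrating the trade-off rather than the paper's design.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Config:
        name: str            # e.g. pruning level / bit-width of the mapped network
        latency_ms: float    # estimated inference latency on the accelerator (QoS)
        accuracy: float      # estimated accuracy of the approximated network (QoR)

    def select_config(configs: list[Config], qos_budget_ms: float) -> Config:
        """Return the highest-QoR config whose latency meets the QoS budget."""
        feasible = [c for c in configs if c.latency_ms <= qos_budget_ms]
        if feasible:
            return max(feasible, key=lambda c: c.accuracy)
        # No config meets the deadline: fall back to the fastest one.
        return min(configs, key=lambda c: c.latency_ms)

    if __name__ == "__main__":
        candidates = [
            Config("full-precision", latency_ms=42.0, accuracy=0.712),
            Config("8-bit",          latency_ms=19.5, accuracy=0.705),
            Config("4-bit-pruned",   latency_ms=8.3,  accuracy=0.671),
        ]
        chosen = select_config(candidates, qos_budget_ms=20.0)
        print(f"selected: {chosen.name} ({chosen.latency_ms} ms, {chosen.accuracy:.1%})")

In the architecture described above, such a decision would be made by the tuning-and-mapping software and realized through hardware self-reconfiguration rather than a host-side lookup like this one.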




Published In

DAC '17: Proceedings of the 54th Annual Design Automation Conference 2017
June 2017
533 pages
ISBN:9781450349277
DOI:10.1145/3061639

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

DAC '17

Acceptance Rates

Overall Acceptance Rate 1,317 of 3,929 submissions, 34%



Cited By

  • (2024) "A Method for Swift Selection of Appropriate Approximate Multipliers for CNN Hardware Accelerators," 2024 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1-5. DOI: 10.1109/ISCAS58744.2024.10558159. Online publication date: 19-May-2024.
  • (2024) "Approximate Full-Adders: A Comprehensive Analysis," IEEE Access, vol. 12, pp. 136054-136072. DOI: 10.1109/ACCESS.2024.3463182. Online publication date: 2024.
  • (2023) "Approximation Opportunities in Edge Computing Hardware: A Systematic Literature Review," ACM Computing Surveys, vol. 55, no. 12, pp. 1-49. DOI: 10.1145/3572772. Online publication date: 3-Mar-2023.
  • (2023) "Network Pruning for Bit-Serial Accelerators," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 42, no. 5, pp. 1597-1609. DOI: 10.1109/TCAD.2022.3203955. Online publication date: May-2023.
  • (2022) "A Fast Precision Tuning Solution for Always-On DNN Accelerators," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 41, no. 5, pp. 1236-1248. DOI: 10.1109/TCAD.2021.3089667. Online publication date: May-2022.
  • (2022) "Multi-Precision Deep Neural Network Acceleration on FPGAs," Proceedings of the 27th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 454-459. DOI: 10.1109/ASP-DAC52403.2022.9712485. Online publication date: 17-Jan-2022.
  • (2022) "Fx-Net and PureNet," Computers in Biology and Medicine, vol. 148. DOI: 10.1016/j.compbiomed.2022.105913. Online publication date: 1-Sep-2022.
  • (2021) "An Energy-Efficient Inference Method in Convolutional Neural Networks Based on Dynamic Adjustment of the Pruning Level," ACM Transactions on Design Automation of Electronic Systems, vol. 26, no. 6, pp. 1-20. DOI: 10.1145/3460972. Online publication date: 1-Aug-2021.
  • (2021) "Reliability Evaluation and Analysis of FPGA-Based Neural Network Acceleration System," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 29, no. 3, pp. 472-484. DOI: 10.1109/TVLSI.2020.3046075. Online publication date: Mar-2021.
  • (2021) "Nonconventional Computer Arithmetic Circuits, Systems and Applications," IEEE Circuits and Systems Magazine, vol. 21, no. 1, pp. 6-40. DOI: 10.1109/MCAS.2020.3027425. Online publication date: Sep-2022.
