ABSTRACT
Deep Neural Networks (DNNs) - the state-of-the-art computational models for many Artificial Intelligence (AI) applications - are inherently compute- and resource-intensive and, hence, cannot afford traditional redundancy-based fault-mitigation techniques for enhancing the dependability of DNN-based systems. There is therefore a dire need for alternative methods that improve reliability without a high expenditure of resources by exploiting the intrinsic characteristics of these networks. In this paper, we present cross-layer approaches that, based on the intrinsic characteristics of DNNs, employ software- and hardware-level modifications to improve the resilience of DNN-based systems to hardware-level faults, e.g., soft errors and permanent faults.
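One concrete software-level instance of exploiting a network's intrinsic characteristics is range restriction of activations, as pursued in the FT-ClipAct and Ranger lines of work: a soft error that flips a high-order bit typically produces an activation far outside the value range observed during fault-free profiling, so clipping activations to a profiled bound suppresses fault propagation at negligible cost. The following is a minimal NumPy sketch of the idea; the threshold value and the injected fault are illustrative assumptions, not values from the paper.

```python
import numpy as np

def clipped_relu(x, threshold):
    """ReLU with an upper bound: activations pushed to very large
    values by bit flips are clipped back into the profiled range,
    limiting how far the fault propagates through later layers."""
    return np.minimum(np.maximum(x, 0.0), threshold)

# Simulate a single-bit soft error that corrupts one activation.
activations = np.array([0.3, -0.1, 0.8, 0.5])
faulty = activations.copy()
faulty[2] = 2.0 ** 20  # corrupted value far outside the profiled range

# With an (assumed) profiled bound of 1.0, the corrupted activation
# is clamped to the threshold instead of dominating the next layer.
safe = clipped_relu(faulty, threshold=1.0)
print(safe)
```

In practice the threshold would be chosen per layer from fault-free profiling runs, trading a small accuracy loss on rare in-range outliers for a large reduction in critical, misclassification-causing faults.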