DOI: 10.1145/3316781.3317742 · DAC Conference Proceedings · Research Article

A Fault-Tolerant Neural Network Architecture

Published: 02 June 2019

Abstract

New DNN accelerators based on emerging technologies, such as resistive random access memory (ReRAM), are gaining increasing research attention for their potential for "in-situ" data processing. Unfortunately, device-level physical limitations unique to these technologies can disturb the weights stored in memory and thus compromise the accuracy and stability of DNN accelerators. In this work, we propose a novel fault-tolerant neural network architecture that mitigates the weight-disturbance problem without expensive retraining. Specifically, we propose a novel collaborative logistic classifier that enhances DNN stability by redesigning the binary classifiers derived from both traditional error-correcting output codes (ECOC) and modern DNN training algorithms. We also develop an optimized variable-length "decode-free" scheme that further boosts accuracy with fewer classifiers. Experimental results on state-of-the-art DNN models and complex datasets show that the proposed fault-tolerant neural network architecture effectively rectifies the accuracy degradation caused by weight disturbance at low cost, allowing its deployment in a variety of mainstream DNNs.
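The ECOC idea the abstract builds on replaces a single multiclass decision with several redundant binary (logistic) classifiers, so that a few faulty bits can still be decoded to the correct class. As a rough sketch only (not the paper's implementation; the code matrix and the soft minimum-distance decoding rule here are illustrative assumptions), fault-tolerant ECOC decoding can look like this:

```python
import numpy as np

# Illustrative ECOC code matrix: one 6-bit codeword per class, chosen so
# that any two codewords differ in at least 3 bits. A minimum Hamming
# distance of 3 lets decoding survive one flipped (disturbed) bit.
CODES = np.array([
    [0, 0, 0, 0, 0, 0],   # class 0
    [0, 1, 0, 1, 0, 1],   # class 1
    [1, 0, 1, 0, 1, 0],   # class 2
    [1, 1, 1, 1, 1, 1],   # class 3
])

def ecoc_decode(bit_probs, codes=CODES):
    """Soft minimum-distance decoding.

    bit_probs: per-bit outputs in [0, 1], e.g. from per-bit logistic
    classifiers. The predicted class is the one whose codeword is
    closest in L1 distance; no explicit error-correction decode table
    is needed.
    """
    dists = np.abs(codes - np.asarray(bit_probs, dtype=float)).sum(axis=1)
    return int(np.argmin(dists))

clean = CODES[2].astype(float)   # ideal classifier outputs for class 2
noisy = clean.copy()
noisy[0] = 1.0 - noisy[0]        # one bit flipped, as a disturbed weight might
print(ecoc_decode(clean))        # -> 2
print(ecoc_decode(noisy))        # -> still 2, despite the single-bit fault
```

In practice, each column of the code matrix would correspond to one trained binary classifier; the redundancy across columns is what absorbs weight disturbance.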




    Published In

    DAC '19: Proceedings of the 56th Annual Design Automation Conference 2019
    June 2019
    1378 pages
    ISBN:9781450367257
    DOI:10.1145/3316781


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    DAC '19

    Acceptance Rates

    Overall Acceptance Rate 1,770 of 5,499 submissions, 32%


    Article Metrics

    • Downloads (Last 12 months)104
    • Downloads (Last 6 weeks)8
    Reflects downloads up to 28 Feb 2025


    Cited By

    • (2025) Layer ensemble averaging for fault tolerance in memristive neural networks. Nature Communications 16(1). DOI: 10.1038/s41467-025-56319-6. Online: 1 Feb 2025.
    • (2024) CorrectNet+: Dealing With HW Non-Idealities in In-Memory-Computing Platforms by Error Suppression and Compensation. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 43(2), 573-585. DOI: 10.1109/TCAD.2023.3313089. Online: Feb 2024.
    • (2024) Design for dependability — State of the art and trends. Journal of Systems and Software, 111989. DOI: 10.1016/j.jss.2024.111989. Online: Feb 2024.
    • (2023) Enabling Neuromorphic Computing for Artificial Intelligence with Hardware-Software Co-Design. Neuromorphic Computing. DOI: 10.5772/intechopen.111963. Online: 15 Nov 2023.
    • (2023) COLA. Proceedings of the 40th International Conference on Machine Learning, 40277-40289. DOI: 10.5555/3618408.3620092. Online: 23 Jul 2023.
    • (2023) A Design Methodology for Fault-Tolerant Neuromorphic Computing Using Bayesian Neural Network. Micromachines 14(10), 1840. DOI: 10.3390/mi14101840. Online: 27 Sep 2023.
    • (2023) Programming Techniques of Resistive Random-Access Memory Devices for Neuromorphic Computing. Electronics 12(23), 4803. DOI: 10.3390/electronics12234803. Online: 27 Nov 2023.
    • (2023) Resilience and Resilient Systems of Artificial Intelligence: Taxonomy, Models and Methods. Algorithms 16(3), 165. DOI: 10.3390/a16030165. Online: 18 Mar 2023.
    • (2023) ReFloat: Low-Cost Floating-Point Processing in ReRAM for Accelerating Iterative Linear Solvers. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 1-15. DOI: 10.1145/3581784.3607077. Online: 12 Nov 2023.
    • (2023) BETTER: Bayesian-Based Training and Lightweight Transfer Architecture for Reliable and High-Speed Memristor Neural Network Deployment. IEEE Transactions on Circuits and Systems II: Express Briefs 70(6), 1846-1850. DOI: 10.1109/TCSII.2022.3231471. Online: Jun 2023.
