DOI: 10.1145/3508352.3561120
invited-talk

Fault-Tolerant Deep Learning Using Regularization

Published: 22 December 2022

Abstract

Resistive random-access memory (ReRAM) has become one of the most popular hardware platforms for machine learning workloads. However, ReRAM devices exhibit non-ideal behavior, which is an obstacle to widespread adoption: training or inference on faulty devices can lead to poor prediction accuracy, and existing fault-tolerance methods carry high implementation overheads. In this paper, we present new directions for solving these reliability issues in software. The methods we consider are already inherent in deep learning training and inference, and they can also be used to address hardware reliability issues. They prevent the accuracy drop that unreliable ReRAMs cause during training and inference, and they are associated with lower area and power overheads.
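To make the accuracy problem concrete, the following is a minimal sketch of how faulty cells can corrupt a single matrix-vector product, the core operation a ReRAM crossbar performs. It is not the method of this paper. The fault model (each cell independently stuck at the lowest or highest representable weight), the function names `inject_stuck_at_faults` and `matvec`, and all parameter values are assumptions chosen for illustration.

```python
import random


def inject_stuck_at_faults(weights, fault_rate, w_min, w_max, seed=0):
    """Return a copy of `weights` with a random subset of cells stuck.

    Assumed fault model (not taken from the paper): each cell fails
    independently with probability `fault_rate` and is then pinned to
    either w_min or w_max, regardless of the value that was programmed.
    """
    rng = random.Random(seed)
    faulty = []
    for row in weights:
        new_row = []
        for w in row:
            if rng.random() < fault_rate:
                new_row.append(rng.choice((w_min, w_max)))
            else:
                new_row.append(w)
        faulty.append(new_row)
    return faulty


def matvec(weights, x):
    """Plain matrix-vector product, standing in for one crossbar operation."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical 4x8 weight matrix with values in [-1, 1].
    W = [[rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(4)]
    x = [1.0] * 8

    clean = matvec(W, x)
    noisy = matvec(inject_stuck_at_faults(W, 0.1, -1.0, 1.0), x)

    # Largest deviation of any output caused by the injected faults.
    print(max(abs(a - b) for a, b in zip(clean, noisy)))
```

A harness like this could be used to compare how differently trained weight matrices respond to the same injected faults, which is one way to study whether a training-time technique improves fault tolerance.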



    Published In

    ICCAD '22: Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design
    October 2022
    1467 pages
    ISBN: 9781450392174
    DOI: 10.1145/3508352
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    In-Cooperation

    • IEEE-EDS: Electronic Devices Society
    • IEEE CAS
    • IEEE CEDA

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. ReRAM
    2. deep learning
    3. regularization
    4. reliability

    Conference

    ICCAD '22: IEEE/ACM International Conference on Computer-Aided Design
    October 30 - November 3, 2022
    San Diego, California

    Acceptance Rates

    Overall Acceptance Rate 457 of 1,762 submissions, 26%
