
Rescuing ReRAM-based Neural Computing Systems from Device Variation

Published: 10 December 2022

Abstract

Resistive random-access memory (ReRAM)-based crossbar arrays (RCAs) are a promising platform for accelerating the vector-matrix multiplications in deep neural networks (DNNs). Several practical issues, however, especially device variation, hinder the broader adoption of ReRAM in neural computing systems. Device variation comprises device-to-device variation (DDV) and cycle-to-cycle variation (CCV), both of which cause the device resistances in the RCA to deviate from their target states. Such resistance deviation seriously degrades the inference accuracy of DNNs. To address this issue, we propose a software-hardware compensation solution that includes compensation training based on scale factors (CTSF) and variation-aware compensation training based on scale factors (VACTSF) to protect ReRAM-based DNN accelerators against device variation. The scale factors in CTSF can be set flexibly to reduce the accuracy loss due to device variation once the weights programmed into the RCA are determined. To handle CCV effectively, the scale factors are also introduced into the training process, which yields variation-tolerant weights by leveraging the inherent self-healing ability of DNNs. Simulation results confirm that, with CTSF, the accuracy losses due to device variation on LeNet-5, ResNet, and VGG16 with different datasets are less than 5% even under large device variation. VACTSF further yields weights that are more robust to CCV. The simulations also show that our method is competitive with other variation-tolerant methods.
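To make the mechanism concrete, below is a minimal PyTorch sketch of the general idea the abstract describes: multiplicative log-normal noise (a common model of ReRAM resistance deviation) is injected into the weights during training, while a learnable per-output scale factor compensates the perturbed outputs. The class name `VariationAwareLinear`, the parameter `sigma`, and the exact noise model are illustrative assumptions for this sketch, not the paper's CTSF/VACTSF implementation.

```python
import torch
import torch.nn as nn

class VariationAwareLinear(nn.Module):
    """Linear layer with injected device-variation noise and a
    learnable compensation scale factor (illustrative sketch)."""

    def __init__(self, in_features, out_features, sigma=0.1):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))
        # One learnable scale factor per output channel.
        self.scale = nn.Parameter(torch.ones(out_features))
        # Spread of the log-normal noise modelling resistance deviation.
        self.sigma = sigma

    def forward(self, x):
        if self.training:
            # Fresh multiplicative noise each forward pass, emulating
            # cycle-to-cycle variation of the programmed conductances.
            noise = torch.exp(torch.randn_like(self.weight) * self.sigma)
            w = self.weight * noise
        else:
            w = self.weight
        # Scale factors compensate the perturbed dot products.
        return self.scale * (x @ w.t()) + self.bias

# Usage: train as usual with this layer in place of nn.Linear.
layer = VariationAwareLinear(128, 10, sigma=0.2)
out = layer(torch.randn(4, 128))
```

Because fresh noise is sampled at every forward pass, gradient descent is steered toward weights whose outputs are insensitive to the injected variation, mirroring the self-healing effect the abstract leverages.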

Published In

ACM Transactions on Design Automation of Electronic Systems, Volume 28, Issue 1
January 2023
321 pages
ISSN: 1084-4309
EISSN: 1557-7309
DOI: 10.1145/3573313

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 10 December 2022
Online AM: 27 April 2022
Accepted: 25 April 2022
Revised: 24 April 2022
Received: 21 October 2021
Published in TODAES Volume 28, Issue 1

Author Tags

  1. Resistive random access memory
  2. device variation
  3. scale factor
  4. deep neural networks

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • National Natural Science Foundation of China
  • Research Foundation from NUDT

Article Metrics

  • Downloads (Last 12 months): 190
  • Downloads (Last 6 weeks): 18
Reflects downloads up to 01 Mar 2025

Cited By

  • (2024) WCPNet: Jointly Predicting Wirelength, Congestion and Power for FPGA Using Multi-Task Learning. ACM Transactions on Design Automation of Electronic Systems 29, 3 (2024), 1–19. DOI: 10.1145/3656170
  • (2024) PointCIM: A Computing-in-Memory Architecture for Accelerating Deep Point Cloud Analytics. In 2024 57th IEEE/ACM International Symposium on Microarchitecture (MICRO’24). 1309–1322. DOI: 10.1109/MICRO61859.2024.00097
  • (2024) A Novel Database Acceleration Technology for Full Table Scans. IEEE Access 12 (2024), 127532–127544. DOI: 10.1109/ACCESS.2024.3452104
  • (2023) CRIMP: Compact & Reliable DNN Inference on In-Memory Processing via Crossbar-Aligned Compression and Non-ideality Adaptation. ACM Transactions on Embedded Computing Systems 22, 5s (2023), 1–25. DOI: 10.1145/3609115
  • (2023) Tolerating Device-to-Device Variation for Memristive Crossbar-Based Neuromorphic Computing Systems: A New Bayesian Perspective. In 2023 International Joint Conference on Neural Networks (IJCNN’23). 1–7. DOI: 10.1109/IJCNN54540.2023.10191448
