
Hessian-driven unequal protection of DNN parameters for robust inference

Published: 17 December 2020

Abstract

This paper presents an algorithmic approach to designing reliable deep neural networks (DNNs) in the presence of stochastic variations in network parameters induced by process variations in the bit-cells of a processing-in-memory (PIM) architecture. We propose and derive a Hessian-based sensitivity metric, computable without forming or storing the full Hessian, that identifies and protects the "important" network parameters while tolerating large variations in the unprotected ones. Experiments on modern DNNs (ResNet, MobileNetV2, DenseNet) on CIFAR-10 demonstrate that shielding only a small (1%--5%) fraction of parameters limits accuracy degradation to under 1%, even under large (50%) stochastic variations in the remaining parameters.
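The abstract's central computational idea, a per-parameter Hessian sensitivity score obtained without materializing the full Hessian, can be illustrated with a Hutchinson-style diagonal estimator built from Hessian-vector products. The sketch below is illustrative rather than the paper's exact metric: the toy quadratic loss, the matrix `A`, and the score `0.5 * H_ii * w_i^2` are assumptions chosen so the true Hessian is known and the estimate can be checked.

```python
import random

# Toy quadratic loss L(w) = 0.5 * w^T A w, so the true Hessian is exactly A.
# (Assumption for illustration; a real DNN loss has no closed-form Hessian.)
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 0.5],
     [0.0, 0.5, 1.0]]
w = [0.2, -1.5, 0.8]

def hvp(v):
    """Hessian-vector product H @ v. For a DNN this is obtained with two
    backward passes (grad of <grad, v>), never by forming H itself."""
    n = len(v)
    return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

def hutchinson_diag(n, num_samples=2000, seed=0):
    """Estimate diag(H) as E[v * (H v)] over Rademacher vectors v,
    since E[v_i * sum_j H_ij v_j] = H_ii when v_j are +/-1 i.i.d."""
    rng = random.Random(seed)
    est = [0.0] * n
    for _ in range(num_samples):
        v = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        hv = hvp(v)
        for i in range(n):
            est[i] += v[i] * hv[i]
    return [e / num_samples for e in est]

diag_h = hutchinson_diag(len(w))

# Second-order sensitivity proxy: loss increase for a perturbation
# proportional to the weight, 0.5 * H_ii * w_i^2 (an assumed scoring rule).
scores = [0.5 * h * wi * wi for h, wi in zip(diag_h, w)]

# Protect only the top fraction of parameters (top-1 here); the rest may
# be left exposed to large stochastic variation.
protect = sorted(range(len(w)), key=lambda i: -scores[i])[:1]
print(diag_h, scores, protect)
```

Note that a large diagonal curvature alone does not decide importance here: index 0 has the largest `H_ii` but a small weight, so its score is low, while index 1 combines moderate curvature with a large weight and ends up protected. Cheap curvature estimates of this kind are what make "unequal protection" tractable at DNN scale.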





Published In

cover image ACM Conferences
ICCAD '20: Proceedings of the 39th International Conference on Computer-Aided Design
November 2020
1396 pages
ISBN:9781450380263
DOI:10.1145/3400302
  • General Chair:
  • Yuan Xie
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


In-Cooperation

  • IEEE CAS
  • IEEE CEDA
  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. deep learning
  2. processing-in-memory
  3. robustness
  4. variation

Qualifiers

  • Research-article

Conference

ICCAD '20

Acceptance Rates

Overall Acceptance Rate 457 of 1,762 submissions, 26%


Article Metrics

  • Downloads (Last 12 months)23
  • Downloads (Last 6 weeks)4
Reflects downloads up to 01 Mar 2025

Cited By
  • (2024)Inshrinkerator: Compressing Deep Learning Training Checkpoints via Dynamic QuantizationProceedings of the 2024 ACM Symposium on Cloud Computing10.1145/3698038.3698553(1012-1031)Online publication date: 20-Nov-2024
  • (2024)Hessian-Aware KV Cache Quantization for LLMs2024 IEEE 67th International Midwest Symposium on Circuits and Systems (MWSCAS)10.1109/MWSCAS60917.2024.10658840(243-247)Online publication date: 11-Aug-2024
  • (2023)Intriguing properties of quantization at scaleProceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3667608(34278-34294)Online publication date: 10-Dec-2023
  • (2023)DEA-NIMC: Dynamic Energy-Aware Policy for Near/In-Memory Computing Hybrid Architecture2023 IEEE 36th International System-on-Chip Conference (SOCC)10.1109/SOCC58585.2023.10256898(1-6)Online publication date: 5-Sep-2023
  • (2023)BWA-NIMC: Budget-based Workload Allocation for Hybrid Near/In-Memory-Computing2023 60th ACM/IEEE Design Automation Conference (DAC)10.1109/DAC56929.2023.10247819(1-6)Online publication date: 9-Jul-2023
  • (2022)CoDG-ReRAM: An Algorithm-Hardware Co-design to Accelerate Semi-Structured GNNs on ReRAM2022 IEEE 40th International Conference on Computer Design (ICCD)10.1109/ICCD56317.2022.00049(280-289)Online publication date: Oct-2022
  • (2021)Reliable Edge Intelligence in Unreliable Environment2021 Design, Automation & Test in Europe Conference & Exhibition (DATE)10.23919/DATE51398.2021.9474097(896-901)Online publication date: 1-Feb-2021
  • (2021)Impact of HKMG and FDSOI FeFET drain current variation in processing-in-memory architecturesJournal of Materials Research10.1557/s43578-021-00393-1Online publication date: 28-Sep-2021
