DOI: 10.1145/3453688.3461491
Research Article

IM3A: Boosting Deep Neural Network Efficiency via In-Memory Addressing-Assisted Acceleration

Published: 22 June 2021

Abstract

Most existing RRAM-based designs require expensive analog-to-digital converters (ADCs), digital-to-analog converters (DACs), and excessively occupied crossbars to achieve efficient acceleration. To reduce the overhead of DACs, the common solution is to split the input into a bit sequence, but a MAC operation that could complete in one cycle is then stretched over multiple cycles, decreasing energy efficiency. To reduce the overhead of ADCs, a weight is generally partitioned across multiple cells, which either requires an excessive number of crossbars or, when crossbars are insufficient, incurs frequent writes. To solve these problems, we propose IM3A, an In-Memory Addressing-Assisted Acceleration scheme. IM3A decomposes MAC operations into multiplication and accumulation, which are implemented separately through the content-addressable and multiply-accumulate capabilities of the crossbar. Energy efficiency is improved because the CAM crossbar supports parallel search over very large numbers of data bits, and the RRAM crossbar selectively enables the rows to be read based on the hit results of the CAM search. Therefore, only the operands possibly involved in a MAC are deployed on the crossbar. Experimental results show that IM3A, applied to various networks, achieves a system energy-efficiency improvement of 1.7x ∼ 15.9x over two state-of-the-art crossbar accelerators: ISAAC and PIM-Prune.
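
The decomposition described above can be summarized with a small behavioral sketch. The Python below is our illustration of the scheme's dataflow, not the authors' implementation: the 4-bit operand widths and the dict standing in for the TCAM/RRAM pair are assumptions made only for illustration.

```python
# Behavioral sketch of the IM3A idea (illustrative, not the paper's design):
# multiplication becomes a content-addressable lookup of a precomputed
# product, and accumulation stays digital, so no DAC/ADC is modeled.
ACT_BITS, WGT_BITS = 4, 4  # assumed operand quantization

# The TCAM matches the operand pair (the search key); the RRAM crossbar row
# selected by the hit stores the precomputed product. A dict stands in for
# the CAM + crossbar pair here.
product_store = {(a, w): a * w
                 for a in range(2 ** ACT_BITS)
                 for w in range(2 ** WGT_BITS)}

def im3a_dot(activations, weights):
    """One MAC stream: CAM search per operand pair, then accumulate."""
    acc = 0
    for a, w in zip(activations, weights):
        if a == 0 or w == 0:          # zero operands never need a search
            continue
        acc += product_store[(a, w)]  # CAM hit -> row enable -> read product
    return acc

print(im3a_dot([3, 0, 7], [2, 5, 1]))  # 3*2 + 7*1 = 13
```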

Supplemental Material

MP4 File
This is the presentation video of our paper, "IM3A: Boosting Deep Neural Network Efficiency via In-Memory Addressing-Assisted Acceleration". Our paper uses a ternary content-addressable memory (TCAM) and a ReRAM crossbar to efficiently implement the multiplications in deep neural networks: the TCAM searches for the index of a multiplication, and the ReRAM crossbar retrieves the product from its memory. By adopting our scheme, expensive digital-to-analog and analog-to-digital converters can be removed from the architecture. In addition, the number of crossbars used to store all the weight matrices in existing schemes can be greatly reduced by storing only operand pairs and their results on the TCAM and ReRAM crossbar, respectively.
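
For readers unfamiliar with the ternary-match primitive the video refers to, here is a minimal functional model (our sketch, not the paper's circuit or the MoS2 device of [15]): each stored key bit is 0, 1, or X (don't care), and hardware evaluates all rows in parallel, whereas the loop below is only a sequential stand-in.

```python
# Functional model of a TCAM search (illustrative only). A row "hits" when
# every non-X stored bit equals the corresponding query bit; in hardware the
# hit line would drive the enable of the matching ReRAM crossbar row.
def tcam_search(rows, query):
    hits = []
    for addr, key in enumerate(rows):
        if all(k in ('X', q) for k, q in zip(key, query)):
            hits.append(addr)
    return hits

rows = ['01X1', '0101', '1XXX']
print(tcam_search(rows, '0111'))  # -> [0]
print(tcam_search(rows, '1010'))  # -> [2]
```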

References

[1]
Rajeev Balasubramonian et al. 2017. CACTI 7: New Tools for Interconnect Exploration in Innovative Off-Chip Memories. TACO, Vol. 14, 2 (2017), 14.
[2]
Yi Cai et al. 2019. Low bit-width convolutional neural network on RRAM. TCAD, Vol. 39, 7 (2019), 1414--1427.
[3]
N. Challapalle et al. 2020. GaaS-X: Graph Analytics Accelerator Supporting Sparse Data Representation using Crossbar Architectures. In 2020 ISCA.
[4]
Teyuh Chou et al. 2019. CASCADE: Connecting RRAMs to extend analog dataflow in an end-to-end in-memory processing paradigm. In MICRO. 114--125.
[5]
Chaoqun Chu et al. 2020. PIM-Prune: Fine-Grain DCNN Pruning for Crossbar-Based Process-In-Memory Architecture. In 2020 DAC. IEEE, 1--6.
[6]
Xiangyu Dong et al. 2012. NVSim: A Circuit-Level Performance, Energy, and Area Model for Emerging Nonvolatile Memory. TCAD, Vol. 31, 7 (2012), 994--1007.
[7]
Benoit Jacob et al. 2018. Quantization and training of neural networks for efficient integer-arithmetic-only inference. In CVPR. 2704--2713.
[8]
Lukas Kull et al. 2013. A 3.1 mW 8b 1.2 GS/s single-channel asynchronous SAR ADC with alternate comparators for enhanced speed in 32 nm digital SOI CMOS. JSSC, Vol. 48, 12 (2013), 3049--3058.
[9]
Jinmook Lee et al. 2019. UNPU: An energy-efficient deep neural network accelerator with fully variable weight bit precision. JSSC, Vol. 54, 1 (2019), 173--185.
[10]
Yuhang Li, Xin Dong, and Wei Wang. 2019. Additive powers-of-two quantization: An efficient non-uniform discretization for neural networks. arXiv preprint arXiv:1909.13144 (2019).
[11]
Ali Shafiee et al. 2016. ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars. In ISCA.
[12]
Linghao Song et al. 2017. PipeLayer: A Pipelined ReRAM-Based Accelerator for Deep Learning. In 2017 HPCA.
[13]
Zhuoran Song et al. 2020. DRQ: Dynamic region-based quantization for deep neural network acceleration. In ISCA. IEEE, 1010--1021.
[14]
Synopsys. 2019. [Online]. Available: https://www.synopsys.com/community/university-program/teaching-resources.html. Accessed: 26-Jun-2019.
[15]
Rui Yang et al. 2019a. Ternary content-addressable memory with MoS2 transistors for massively parallel data search. Nature Electronics, Vol. 2, 3 (2019), 108--114.
[16]
Tzu-Hsien Yang et al. 2019b. Sparse ReRAM engine: Joint exploration of activation and weight sparsity in compressed neural networks. In ISCA. 236--249.
[17]
Peng Yao et al. 2020. Fully hardware-implemented memristor convolutional neural network. Nature, Vol. 577, 7792 (2020), 641--646.
[18]
Jishen Zhao et al. 2016. PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory. Computer Architecture News (2016).


      Published In

      GLSVLSI '21: Proceedings of the 2021 Great Lakes Symposium on VLSI
      June 2021
      504 pages
      ISBN: 9781450383936
      DOI: 10.1145/3453688


      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. neural network
      2. processing-in-memory
      3. resistive random-access memory
      4. ternary content-addressable memory

      Qualifiers

      • Research-article

      Data Availability

      Presentation video: https://dl.acm.org/doi/10.1145/3453688.3461491#GLSVLSI21-fp033.mp4


      Conference

      GLSVLSI '21: Great Lakes Symposium on VLSI 2021
      June 22 - 25, 2021
      Virtual Event, USA

      Acceptance Rates

      Overall Acceptance Rate 312 of 1,156 submissions, 27%

