DOI: 10.1145/3386263.3407588

In-Memory Computing: The Next-Generation AI Computing Paradigm

Published: 07 September 2020

Abstract

To overcome the memory bottleneck of the von Neumann architecture, various memory-centric computing techniques are emerging to reduce the latency and energy consumption caused by data movement. The great success of artificial intelligence (AI) algorithms, which involve large numbers of computations and data movements, has motivated and accelerated recent research on in-memory computing (IMC) techniques that significantly reduce, or even eliminate, off-chip data accesses: memory not only stores data but can also directly output computation results. For example, the multiply-and-accumulate (MAC) operations in deep learning algorithms can be realized by accessing the memory with the input activations. This paper surveys recent trends in IMC, from device technologies (SRAM, flash, RRAM, and other types of non-volatile memory) to architectures and applications, and serves as a guide to future advances in computing in-memory (CIM).
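The MAC-in-memory idea mentioned above can be illustrated with a minimal Python sketch of a resistive crossbar: weights are stored as cell conductances, input activations are applied as word-line voltages, and Kirchhoff's current law sums the per-cell currents on each bit line, yielding one dot product per column in a single array access. The array size, conductance range, and voltage values below are illustrative assumptions, and a real analog array would add nonidealities (device variation, wire resistance, ADC quantization) that this sketch omits.

```python
import numpy as np

# Hypothetical 4x3 crossbar: each weight stored as a cell conductance G (siemens).
rng = np.random.default_rng(0)
weights = rng.uniform(1e-6, 1e-4, size=(4, 3))   # conductance matrix G[i, j]

# Input activations applied as word-line voltages V (volts).
activations = np.array([0.1, 0.2, 0.0, 0.3])

# Kirchhoff's current law: I_j = sum_i V_i * G[i, j],
# i.e. every bit-line current is one complete MAC result,
# obtained without fetching any weight from off-chip memory.
bitline_currents = activations @ weights

# The digital MAC loop this single array read replaces:
reference = np.dot(activations, weights)
assert np.allclose(bitline_currents, reference)
```

The point of the sketch is architectural rather than numerical: the matrix-vector product happens where the weights already reside, so the only data that moves are the activations in and the accumulated results out.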

Supplementary Material

MP4 File (3386263.3407588.mp4)
Presentation video




Published In

GLSVLSI '20: Proceedings of the 2020 on Great Lakes Symposium on VLSI
September 2020
597 pages
ISBN: 9781450379441
DOI: 10.1145/3386263
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. NVM
  2. SRAM
  3. convolutional neural networks (CNNs)
  4. deep learning
  5. in-memory computing (IMC)

Qualifiers

  • Short-paper

Funding Sources

  • National Natural Science Foundation of China

Conference

GLSVLSI '20
GLSVLSI '20: Great Lakes Symposium on VLSI 2020
September 7 - 9, 2020
Virtual Event, China

Acceptance Rates

Overall Acceptance Rate 312 of 1,156 submissions, 27%


Cited By

  • (2025) "SPICE-Level Demonstration of Unsupervised Learning With Spintronic Synapses in Spiking Neural Networks," IEEE Access, vol. 13, pp. 6845--6854, 2025. DOI: 10.1109/ACCESS.2024.3411519
  • (2024) "BP-IMCA: An Energy-Efficient 8T SRAM Based Bit-Parallel In-Memory Computing Architecture," Journal of Circuits, Systems and Computers, 18 Oct. 2024. DOI: 10.1142/S0218126625501245
  • (2024) "An In-Memory Power Efficient Computing Architecture with Emerging VGSOT MRAM Device," 2024 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1--5, 19 May 2024. DOI: 10.1109/ISCAS58744.2024.10557835
  • (2024) "A High Linearity Current-Mirror MAC Circuit Design for SRAM Computing-in-Memory," 2024 9th International Conference on Integrated Circuits and Microsystems (ICICM), pp. 712--716, 25 Oct. 2024. DOI: 10.1109/ICICM63644.2024.10814160
  • (2024) "In-memory computing: characteristics, spintronics, and neural network applications insights," Multiscale and Multidisciplinary Modeling, Experiments and Design, vol. 7, no. 6, pp. 5005--5029, 9 Jul. 2024. DOI: 10.1007/s41939-024-00517-0
  • (2023) "FAT: An In-Memory Accelerator With Fast Addition for Ternary Weight Neural Networks," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 42, no. 3, pp. 781--794, Mar. 2023. DOI: 10.1109/TCAD.2022.3184276
  • (2023) "iMAT: Energy-Efficient In-Memory Acceleration for Ternary Neural Networks With Sparse Dot Product," 2023 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED), pp. 1--6, 7 Aug. 2023. DOI: 10.1109/ISLPED58423.2023.10244333
  • (2023) "A 0.0025mm² 8-bit 70MS/s SAR ADC with a Linearity-Improved Bootstrapped Switch for Computation in Memory," 2023 8th International Conference on Integrated Circuits and Microsystems (ICICM), pp. 412--416, 20 Oct. 2023. DOI: 10.1109/ICICM59499.2023.10365851
  • (2023) "Local bit line 8T SRAM based in-memory computing architecture for energy-efficient linear error correction codec implementation," Microelectronics Journal, vol. 137, art. 105795, Jul. 2023. DOI: 10.1016/j.mejo.2023.105795
  • (2023) "An Energy-Efficient Hybrid SRAM-Based In-Memory Computing Macro for Artificial Intelligence Edge Devices," Circuits, Systems, and Signal Processing, vol. 42, no. 6, pp. 3589--3616, 14 Jan. 2023. DOI: 10.1007/s00034-022-02284-0
