SpinCIM: spin orbit torque memory for ternary neural networks based on the computing-in-memory architecture

Luo, Lichuan; Liu, Dijun; Zhang, He; Zhang, Youguang; Bai, Jinyu; Kang, Wang

doi:10.1007/s42514-022-00108-w

Regular Paper
Published: 24 June 2022

Volume 4, pages 421–434 (2022)

CCF Transactions on High Performance Computing

Lichuan Luo ORCID: orcid.org/0000-0001-9808-1344¹,
Dijun Liu¹,
He Zhang²,
Youguang Zhang¹,
Jinyu Bai³ &
Wang Kang³

Abstract

Although significant breakthroughs have been made in deep neural networks (DNNs), which show impressive potential as a general solution in the field of artificial intelligence (AI), DNN computing tasks generally need billions of floating-point multiplication and accumulation (MAC) operations, posing great challenges for hardware resources, power consumption, and communication bandwidth. The computing-in-memory (CIM) architecture, which integrates memory and computing, shows fascinating prospects for DNNs, especially when based on spintronic memories, owing to its high energy efficiency and good endurance. In this work, we leveraged coupled magnetic tunnel junctions (MTJs), driven by the interplay of field-free spin orbit torque (SOT) and spin transfer torque (STT) effects, to realize two different innovative stateful CIM paradigms for ternary MAC operations. Based on both paradigms, we further demonstrated highly parallel array structures that implement a memory array capable of functioning both as memory and as CIM for ternary neural networks (TNNs). Our results demonstrated that the area overhead for CIM is only about 0.8% of the memory array. The advantage of our design in power consumption was illustrated by comparison with CPUs, GPUs, and other state-of-the-art works.
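
As a purely arithmetic illustration of what a ternary MAC computes, the sketch below accumulates products of operands restricted to {-1, 0, +1}. It is not the circuit-level scheme of this work: the function name `ternary_mac` is hypothetical, and treating both weights and activations as ternary is an assumption made for the example. Because every product is itself -1, 0, or +1, the sum reduces to counting matching and mismatching non-zero pairs, with no floating-point multiplication.

```python
from typing import Sequence

TERNARY = (-1, 0, 1)


def ternary_mac(weights: Sequence[int], activations: Sequence[int]) -> int:
    """Multiply-accumulate over operands restricted to {-1, 0, +1}.

    Hypothetical helper for illustration only; assumes both operand
    vectors are ternary.
    """
    if len(weights) != len(activations):
        raise ValueError("operand vectors must have the same length")

    positive = 0  # pairs whose product is +1
    negative = 0  # pairs whose product is -1
    for w, a in zip(weights, activations):
        if w not in TERNARY or a not in TERNARY:
            raise ValueError("operands must be -1, 0, or +1")
        if w == 0 or a == 0:
            continue  # a zero operand contributes nothing to the sum
        if w == a:
            positive += 1
        else:
            negative += 1
    return positive - negative


if __name__ == "__main__":
    print(ternary_mac([1, -1, 0, 1], [1, 1, -1, -1]))
```

Replacing multiplication with sign comparison and counting in this way is what makes ternary networks attractive for in-memory logic in general; how the counting is realized in hardware is specific to each design.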

Acknowledgements

This work is supported by the Beijing Nova Program from the Beijing Municipal Science and Technology Commission (No. Z201100006820042 and No. Z211100002121014) and the National Natural Science Foundation of China (Grant No. 61871008). Lichuan Luo and He Zhang contributed equally to this work. On behalf of all authors, the corresponding author (Wang Kang) states that there is no conflict of interest.

Author information

Authors and Affiliations

School of Electronic and Information Engineering, Beihang University, Beijing, China
Lichuan Luo, Dijun Liu & Youguang Zhang
School of Computer Science and Engineering, Beihang University, Beijing, China
He Zhang
School of Integrated Circuit Science and Engineering, Beihang University, Beijing, China
Jinyu Bai & Wang Kang

Corresponding author

Correspondence to Wang Kang.

About this article

Cite this article

Luo, L., Liu, D., Zhang, H. et al. SpinCIM: spin orbit torque memory for ternary neural networks based on the computing-in-memory architecture. CCF Trans. HPC 4, 421–434 (2022). https://doi.org/10.1007/s42514-022-00108-w

Received: 25 February 2022
Accepted: 26 May 2022
Published: 24 June 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s42514-022-00108-w